The NYTimes Created a Documentation Tool That Renders Your Content From Google Docs

Here’s how you can implement it in your organization or project

The NYTimes Library App

If you work with documentation, you know it is hard to maintain it.

The NYTimes faced this issue, so they created a collaborative documentation site that renders content from existing Google Docs.

The best part is that they decided to make it open source, so I decided to give it a try.

The easiest way to implement it is using Heroku, and that was my first approach. But because I was using my personal Heroku account, it wouldn’t be the best way to roll it out for the whole team at my organization.

I shared the idea with my team, and they suggested using Docker and the AWS Cloud to host the application. That way, the app is no longer tied to a single member of the team.

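I describe both approaches below. The whole process consists of cloning the repository, configuring your Google Cloud Platform, and deploying to either Heroku or AWS.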

Cloning the Repository

To get a copy of the repository on your local machine,

  1. Run git clone git@github.com:nytimes/library.git.
  2. Move into the directory with cd library.
  3. Install the dependencies with: npm install --no-optional.
  4. Create a .env file at the project root, with the following structure:
    APPROVED_DOMAINS=yourorganization.com
    DRIVE_ID=<the ID of your team's drive or shared folder>
    DRIVE_TYPE=<'team' or 'folder'>
    GOOGLE_APPLICATION_JSON=<The JSON file you get when creating your Google service account>
    GCP_PROJECT_ID=<your Google Cloud Project ID>
    GOOGLE_APPLICATION_CREDENTIALS=parse_json
    GOOGLE_CLIENT_ID=<Your Google OAuth Client ID>
    GOOGLE_CLIENT_SECRET=<Your Google OAuth Client Secret>
    

    NOTE: You will get the values for the .env variables when configuring your Google Cloud Platform

  5. Once you have configured your Google Cloud Platform, update the .env file with the corresponding values and run npm run build && npm run watch to run the application locally.
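
Before running the app, it can help to confirm that no variable from step 4 was left empty. The sketch below is not part of the Library repository; missing_env_keys is a hypothetical helper that prints every required key that has no value in a given .env file.

```shell
# Variables listed in step 4 of this section.
REQUIRED_KEYS="APPROVED_DOMAINS DRIVE_ID DRIVE_TYPE GOOGLE_APPLICATION_JSON GCP_PROJECT_ID GOOGLE_APPLICATION_CREDENTIALS GOOGLE_CLIENT_ID GOOGLE_CLIENT_SECRET"

# missing_env_keys FILE: print each required key that is absent or empty in FILE.
missing_env_keys() {
  local file="$1" key
  for key in $REQUIRED_KEYS; do
    # A key counts as set when a line starts with KEY= followed by at least one character.
    grep -Eq "^${key}=.+" "$file" || echo "$key"
  done
}
```

Running missing_env_keys .env from the project root prints nothing when every variable has a value.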

Configuring your Google Cloud Platform

Because the application renders the content from Google Docs, it needs access to those documents through the Google APIs. These are the steps to configure a Google Cloud Platform project to communicate with your library application.

I. Create a Google Cloud Platform Project

To create a new GCP project,

  1. Go to https://console.cloud.google.com/projectcreate.
  2. Give your project a name (you can add your project to an organization, but it’s not required).
  3. Click Create.

When done, you should be able to see and select your project from the dropdown at the top bar.

II. Create a Service Account

To create the service account for your Library instance,

  1. Click the menu button on the left of the top bar and navigate to
    APIs & Services > Credentials.
  2. Click the Create credentials button and select Service account key from the options.
  3. Click the dropdown and select New service account.
  4. Give the service account a name and, from the Role dropdown, go to Datastore and select Cloud Datastore User.
  5. Leave the JSON format selected and click Create.

    IMPORTANT: Save the JSON file in a secure location.

  6. Click the menu button at the top bar again and navigate to
    IAM & admin > Service accounts.
  7. Copy the service account email associated with your project.
  8. Go to your shared folder in Google Drive.
  9. From the dropdown menu with the name of your shared folder, select Share, paste the service account email in the text box, and click Done.

    TIP: Uncheck the Notify people box in the lower right corner of the popup window to avoid getting a Delivery status failure notification email from Google.
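
The key file from step 5 already contains the service account email that step 7 asks you to copy, in its client_email field. As a minimal sketch, the hypothetical helpers below read that field and print the key on a single line, which is convenient for a line-based .env file. They assume the field sits on one line, as in the file Google generates; a JSON-aware tool is more robust if you have one installed.

```shell
# service_account_email KEY_FILE: print the client_email field of a service account key.
service_account_email() {
  sed -n 's/.*"client_email"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# json_on_one_line KEY_FILE: print the JSON with its line breaks removed.
# JSON strings cannot contain raw newlines, so the result is still valid JSON.
json_on_one_line() {
  tr -d '\n' < "$1"
}
```

For example, service_account_email ./service-account.json prints the address to share the folder with, where the file name is whatever you chose when saving the key.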

III. Enable and Configure API services

To enable the APIs in your Google Console,

  1. Click the menu button on the top bar.
  2. Navigate to APIs & Services > Library.
  3. Search for the Google Drive API and click it when it appears.
  4. Click the Enable button.
  5. Go back to the Library and search for Cloud Datastore API.
  6. Click it to open and then click the Enable button.
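
If you prefer the command line, the same two APIs can be enabled with gcloud services enable instead of the console. This sketch assumes the Google Cloud SDK is installed and authenticated, which the next section covers; enable_library_apis and the DRY_RUN switch are hypothetical names introduced here for illustration.

```shell
# enable_library_apis PROJECT_ID: enable the Google Drive and Cloud Datastore APIs.
# With DRY_RUN=1 the command is printed instead of executed.
enable_library_apis() {
  local project="$1"
  set -- gcloud services enable drive.googleapis.com datastore.googleapis.com --project "$project"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Setting DRY_RUN=1 first lets you review the exact command before it touches your project.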

IV. Set Up the Cloud Datastore API

To set up Cloud Datastore,

  1. Install the Google Cloud SDK corresponding to your OS.
  2. Open a Terminal window and run gcloud auth login to authenticate your account with the SDK.
  3. Back in your Google console, click the menu button, navigate to Datastore (under Storage), and select Entities.
  4. Click the SELECT DATASTORE MODE button and select a region.
  5. Click Create Database.
  6. Click the menu on the top bar and navigate to IAM & admin > Settings. Copy the Project ID value.
  7. In your terminal window, change directory to your Library project, and run the following command: gcloud datastore indexes create ./config/index.yaml --project <your Project ID>

V. Create an OAuth Client

To set up an OAuth 2.0 Client,

  1. Click the menu on the top bar and navigate to APIs & Services > Credentials.
  2. Click the Create credentials dropdown and select OAuth client ID.
  3. Select Web application for the Application type.
  4. Give it a name and click the Create button.

Deploying to Heroku

The NYTimes team created a Deploy to Heroku button which takes you to a Heroku configuration page where you can enter the values required to set up and create your instance of the Library app.

  1. Sign up or log into your Heroku account.
  2. Go to the NYTimes Library app GitHub repository and click the Deploy to Heroku button.
    You’ll be prompted to enter the environment variables to set up your instance of the app.
  3. Fill the values with the corresponding information:
    • In APPROVED_DOMAINS you can enter the @domain.com corresponding to your organization. It can be a comma-separated list of domains.
      NOTE: Only people with those email domains will be able to log in and see your Library instance.
    • The DRIVE_ID is your Google Drive shared folder ID.
      NOTE: The ID is the string at the end of your team drive or shared folder’s URL: https://drive.google.com/drive/folders/<your_shared_folder_ID>.
    • The DRIVE_TYPE value can be team or folder, depending on whether you’re using a team drive or a shared folder.
    • The GCP_PROJECT_ID is the project ID you get when setting up the Cloud Datastore API.
    • GOOGLE_APPLICATION_CREDENTIALS should always be set to parse_json.
    • The GOOGLE_APPLICATION_JSON value is the JSON you get when Creating a Service Account.
    • The GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are the values from the OAuth client configuration.
  4. Click the Deploy app button.

That’s it. Heroku deploys your application and gives you the public URL where you can visit it.

This is the NYTimes documentation on how to configure the Google Cloud Platform and deploy to Heroku.
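
For the DRIVE_ID value described above, a small shell helper can pull the ID out of a folder URL copied from the browser. This is only a sketch: drive_id_from_url is a hypothetical name, and it assumes the ID is the last path segment, as in the URL pattern shown in the note.

```shell
# drive_id_from_url URL: print the last path segment of a Drive folder URL.
drive_id_from_url() {
  local url="${1%%[?]*}"  # drop any query string
  url="${url%/}"          # drop a trailing slash
  echo "${url##*/}"       # keep what follows the last slash
}
```

Calling drive_id_from_url with the address from your browser's location bar prints the value to paste into DRIVE_ID.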

Deploying to AWS

If you don’t want to use Heroku, you can create a Docker image of the application to deploy somewhere else.

In this case, I used the Amazon Elastic Container Registry (ECR) to host the Docker image and the Amazon Elastic Container Service (ECS) with Fargate to deploy it. The following sections describe the steps.

Create a Docker Image

To create the Docker image of the app,

  1. Install Docker if you don’t have it already.
  2. Go to the directory where you cloned the Library repository. There should be a Dockerfile, which should look like this:
    FROM nytimes/library
    
    # install custom files
    COPY . ./custom/
    
    # install custom npm packages
    WORKDIR /usr/src/tmp
    COPY package*.json ./
    RUN npm i
    RUN yes | cp -rf ./node_modules/* /usr/src/app/node_modules
    
    # return to app and build
    WORKDIR /usr/src/app
    RUN npm run build
    
    # start app
    CMD [ "npm", "start" ]
    
  3. From a Terminal window at your project directory, run the following command: docker build -t <the_name_you_want_for_your_image> .
  4. If you want to run your image locally, you’ll need to add a hostname to your local hosts file so you can add it to the Authorized domains list on your Google Cloud Platform. To do so:
    • Open a Terminal window
    • Type sudo tee -a /etc/hosts and press Enter
    • Type a line that maps 127.0.0.1 to the name you want for your localhost, and press Enter
    • Press Ctrl-D to save the file
    • Go to your Google Console
    • From the main menu, navigate to APIs & Services > Credentials
    • Click the OAuth consent screen tab
    • Add the domain under the Authorized domains section, hit Enter, and click Save
    • Go back to the Credentials tab, click the name of your OAuth Client and add the same domain with the following structure: https://yourdomain:port/auth/redirect under the Authorized redirect URIs section and click Save
  5. Run the following command to run your image locally: docker run --env-file ./.env -p 8888:8000 <your_image>

    Your container should now be reachable on the host port you specified (8888 in this example).
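
The hosts file change from step 4 can also be scripted so that repeated runs do not add duplicate lines. The sketch below is an illustration under assumptions: add_host_entry is a hypothetical helper, library.local is a made-up hostname, and writing to the real /etc/hosts normally requires root.

```shell
# add_host_entry HOSTNAME [HOSTS_FILE]: map HOSTNAME to 127.0.0.1 unless it is already listed.
# HOSTS_FILE defaults to /etc/hosts.
add_host_entry() {
  local name="$1" file="${2:-/etc/hosts}"
  # -w matches the hostname as a whole word, -F treats it as a literal string.
  if grep -qwF "$name" "$file"; then
    return 0
  fi
  printf '127.0.0.1\t%s\n' "$name" >> "$file"
}
```

Trying it on a copy first, for example add_host_entry library.local ./hosts.copy, shows the resulting line without touching the system file.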

Push your Image to AWS ECR

To upload your Docker image to AWS ECR and be able to use it later,

  1. Go to your AWS Console
  2. From the Services dropdown at the top bar, search for ECR and select it to open
  3. Click the Get Started button
  4. Name your repository (preferably with the same name you used for your image) and click Create repository
    Your repository gets created and you are redirected to a screen with the repository name and its URI
  5. Click the repository name
  6. Click the View push commands button
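  7. Open a Terminal window and run the first command, which looks like this one: aws ecr get-login --no-include-email --region us-east-1
    The region can change depending on which AWS region you’re using
    NOTE: You need to have the AWS CLI installed on your computer
    The response is the command you have to use to authenticate your Docker client to your AWS registry
    NOTE: Version 2 of the AWS CLI replaces get-login with get-login-password, piped into docker login. If the push commands show that form, run it as shown and skip the next step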
  8. Copy the command provided in the response and run it
  9. Since you already have your image, you can skip the build command and run the tag command, which is something like this: docker tag <your_image_name>:latest 866674269210.dkr.ecr.us-east-1.amazonaws.com/<your_repo_name>:latest
  10. Run the last command to push your image to the registry: docker push 866674269210.dkr.ecr.us-east-1.amazonaws.com/<your_repo_name>:latest
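
The registry address in the tag and push commands follows the pattern visible in steps 9 and 10: account ID, region, repository name, and tag. As a rough sketch, the hypothetical helpers below assemble that address and print the two commands so you can check them before running anything. The account ID in the usage note is a placeholder.

```shell
# ecr_image_uri ACCOUNT_ID REGION REPO [TAG]: print the registry URI for an image.
ecr_image_uri() {
  local account="$1" region="$2" repo="$3" tag="${4:-latest}"
  echo "${account}.dkr.ecr.${region}.amazonaws.com/${repo}:${tag}"
}

# print_push_commands LOCAL_IMAGE ACCOUNT_ID REGION REPO: print the tag and push commands.
print_push_commands() {
  local uri
  uri="$(ecr_image_uri "$2" "$3" "$4")"
  echo "docker tag $1:latest $uri"
  echo "docker push $uri"
}
```

For instance, print_push_commands library 123456789012 us-east-1 library prints a docker tag line followed by a docker push line for the library repository.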

Host your Image on AWS ECS

To host your image and deploy it with AWS ECS and Fargate,

  1. Go to your AWS Console
  2. From the Services dropdown at the top bar, choose ECS
  3. Click Get Started
  4. Navigate to the custom panel and click Configure
  5. Fill the required details:
    • Give a name to the container
    • In the Image box, paste your repository URI from the Push your Image to AWS ECR section
    • In the Port mappings box, enter 3000 or the port number you want your app to listen on
    • Click Advanced container configuration to expand the section
    • In the Environment section, add 1024 CPU units
    • Fill in the Env Variables values with the key-value pairs from the .env file.
      Use a secret manager service for sensitive values such as secret keys
    • In the Storage and Logging section, change the awslogs-group value to /ecs/library-app-task
    • Click the Update button
      You’ll see a warning message about the Task CPU. You’ll fix that in the next step
  6. Navigate to the Task definition section and click the Edit button
    • Rename the Task definition name to library-app-task
    • Change Task memory value to 4 GB
    • Change Task CPU value to 2 vCPU
    • Click Save and then click Next
  7. In the Define your service section, select the Application Load Balancer option for the Load balancer type and click Next
  8. Name your cluster library-app-cluster and click Next
    You can see your services being launched. It can take a few minutes to finish
  9. Once the process is complete, click View service
  10. Click the Target Group Name under Load Balancing
    You’re redirected to your EC2 management console
  11. In the Description tab, scroll down and click the Load balancer name
    You’re redirected to the Load balancer details
  12. In the Description tab, scroll down and copy the DNS name
  13. Append :3000 (or the port you chose earlier) to the end and visit that address
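
The CPU values in steps 5 and 6 use two different units: the container setting is in CPU units, while the task setting is in vCPUs, and 1 vCPU equals 1024 CPU units. The Task CPU warning in step 5 is consistent with the container reserving more than the task allows until the task is raised to 2 vCPU, though that reading is an assumption. The hypothetical helpers below make the comparison explicit.

```shell
# vcpu_to_units VCPU: convert whole vCPUs to ECS CPU units (1 vCPU = 1024 units).
vcpu_to_units() {
  echo $(( $1 * 1024 ))
}

# container_fits_task CONTAINER_UNITS TASK_VCPU: succeed when the container's
# CPU reservation does not exceed the task's CPU.
container_fits_task() {
  [ "$1" -le "$(vcpu_to_units "$2")" ]
}
```

With the values from this walkthrough, container_fits_task 1024 2 succeeds, because 1024 units fit inside the 2048 units of a 2 vCPU task.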

That’s it! Your Library application is up and running!