The NYTimes Created a Documentation Tool That Renders Your Content From Google Docs
Here’s how you can implement it in your organization or project
The NYTimes Library App
If you work with documentation, you know how hard it is to maintain.
The NYTimes faced this issue, so they created a collaborative documentation site that renders content from existing Google Docs.
The best part is that they decided to make it open source, so I decided to give it a try.
The easiest way to implement it is using Heroku, and that was my first approach. But because I was using my personal Heroku account, it wasn't the best way to set it up for the whole team at my organization.
I shared the idea with my team, and they suggested using Docker and the AWS Cloud to host the application. That way, the app is no longer tied to a single member of the team.
I describe both approaches below. The whole process consists of:
- Cloning the repository
- Configuring your Google Cloud Platform
- Deploying to Heroku or AWS using a Docker image
Cloning the Repository
To get a copy of the repository on your local machine,
- Run
git clone git@github.com:nytimes/library.git
- Move into the directory with
cd library
- Install the dependencies with
npm install --no-optional
- Create a .env file at the project root, with the following structure:
APPROVED_DOMAINS=yourorganization.com
DRIVE_ID=<the ID of your team's drive or shared folder>
DRIVE_TYPE=<'team' or 'folder'>
GOOGLE_APPLICATION_JSON=<The JSON file you get when creating your Google service account>
GCP_PROJECT_ID=<your Google Cloud Project ID>
GOOGLE_APPLICATION_CREDENTIALS=parse_json
GOOGLE_CLIENT_ID=<Your Google OAuth Client ID>
GOOGLE_CLIENT_SECRET=<Your Google OAuth Client Secret>
NOTE: You will get the values for the .env variables when configuring your Google Cloud Platform.
- Once you have configured your Google Cloud Platform, update the .env file with the corresponding values and run
npm run build && npm run watch
to run the application locally.
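Before running the build, it can help to confirm that every variable from the structure above has a value. The following is a minimal sketch, not part of the Library project: the function name check_env_file is my own, and it only checks that each key is followed by at least one character.

```shell
#!/bin/sh
# Hypothetical helper: list the variables from the article's .env structure
# that are missing or empty in the given file.
# Prints one missing name per line and returns non-zero if any are missing.
check_env_file() {
    env_file="$1"
    missing=0
    for key in APPROVED_DOMAINS DRIVE_ID DRIVE_TYPE GOOGLE_APPLICATION_JSON \
               GCP_PROJECT_ID GOOGLE_APPLICATION_CREDENTIALS \
               GOOGLE_CLIENT_ID GOOGLE_CLIENT_SECRET; do
        # A key counts as set when a line starts with KEY= plus at least one character.
        if ! grep -q "^${key}=." "$env_file"; then
            echo "$key"
            missing=1
        fi
    done
    return "$missing"
}
```

With the function loaded in your shell, check_env_file .env && npm run build && npm run watch starts the app only when nothing is missing.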
Configuring your Google Cloud Platform
Because the application renders the content from Google Docs, it needs access to those documents through the Google APIs. These are the steps to configure a Google Cloud Platform project to communicate with your library application.
I. Create a Google Cloud Platform Project
To create a new GCP project,
- Go to https://console.cloud.google.com/projectcreate.
- Give your project a name (you can add your project to an organization, but it’s not required).
- Click Create.
When done, you should be able to see and select your project from the dropdown at the top bar.
II. Create a Service Account
To create the service account for your Library instance,
- Click the menu button on the left of the top bar and navigate to APIs & Services ➡ Credentials.
- Click the Create credentials button and select Service account key from the options.
- Click the dropdown and select New service account.
- Give the service account a name and, from the Role dropdown, go to Datastore and select Cloud Datastore User.
- Leave the JSON format selected and click Create.
IMPORTANT: Save the JSON file in a secure location.
- Click the menu button at the top bar again and navigate to IAM & admin ➡ Service accounts.
- Copy the service account email associated with your project.
- Go to your shared folder in Google Drive.
- From the dropdown menu with the name of your shared folder, select Share, paste the service account email in the text box, and click Done.
TIP: Uncheck the Notify people box in the lower right corner of the popup window to avoid getting a Delivery status failure notification email from Google.
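The downloaded key file also contains the service account email, in its client_email field, so you can read it from the terminal instead of the console. The sketch below uses two helper names of my own. It assumes the pretty-printed key file Google generates, and that your .env loader (or Docker's --env-file option) expects each variable on a single line.

```shell
#!/bin/sh
# Hypothetical helper: print the client_email field of a service account key file.
service_account_email() {
    sed -n 's/.*"client_email"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Hypothetical helper: print the key file as a single line, for use as a
# one-line GOOGLE_APPLICATION_JSON value.
# Only real line breaks are removed; the "\n" escapes inside private_key stay as they are.
json_on_one_line() {
    tr -d '\n' < "$1"
}
```

For example, service_account_email ./key.json prints the address to share the folder with, where ./key.json is a placeholder for wherever you saved the file.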
III. Enable and Configure API services
To enable the APIs in your Google Console,
- Click the menu button on the top bar.
- Navigate to APIs & Services ➡ Library.
- Search for the Google Drive API and click it when it appears.
- Click the Enable button.
- Go back to the Library and search for Cloud Datastore API.
- Click it to open and then click the Enable button.
IV. Set Up the Cloud Datastore API
To set up Cloud Datastore for your project,
- Install the Google Cloud SDK corresponding to your OS.
- Open a Terminal window and run
gcloud auth login
to authenticate your account with the SDK.
- Back in your Google console, click the menu button, navigate to Datastore (under Storage), and select Entities.
- Click the SELECT DATASTORE MODE button and select a region.
- Click Create Database.
- Click the menu on the top bar and navigate to IAM & admin ➡ Settings. Copy the Project ID value.
- In your terminal window, change directory to your Library project, and run the following command:
gcloud datastore indexes create ./config/index.yaml --project <your Project ID>
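If the Project ID is already in your .env file, you can build the command from it instead of pasting the value by hand. This is a sketch with helper names of my own (env_value, datastore_index_command). It only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from a KEY=value file (first match only).
env_value() {
    sed -n "s/^$1=//p" "$2" | head -n 1
}

# Hypothetical helper: print (not run) the index command from the step above,
# using the GCP_PROJECT_ID found in the given .env file.
datastore_index_command() {
    project_id=$(env_value GCP_PROJECT_ID "$1")
    if [ -z "$project_id" ]; then
        echo "GCP_PROJECT_ID is not set in $1" >&2
        return 1
    fi
    echo "gcloud datastore indexes create ./config/index.yaml --project $project_id"
}
```

Run datastore_index_command .env from the Library project directory, then copy the printed command into your terminal.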
V. Create an OAuth Client
To set up an OAuth 2.0 Client,
- Click the menu on the top bar and navigate to APIs & Services ➡ Credentials.
- Click the Create credentials dropdown and select OAuth client ID.
- Select Web application for the Application type.
- Give it a name and click the Create button.
Deploying to Heroku
The NYTimes team created a Deploy to Heroku button which takes you to a Heroku configuration page where you can enter the values required to set up and create your instance of the Library app.
- Sign up or log in to your Heroku account.
- Go to the NYTimes Library app GitHub repository and click the Deploy to Heroku button.
You’ll be prompted to enter the environment variables to set up your instance of the app.
- Fill the values with the corresponding information:
  - In APPROVED_DOMAINS, enter the @domain.com corresponding to your organization. It can be a comma-separated list of domains.
  NOTE: Only people with those email domains will be able to log in and see your Library instance.
  - The DRIVE_ID is your Google Drive shared folder ID.
  NOTE: The ID is the string at the end of your team drive or shared folder’s URL:
  https://drive.google.com/drive/folders/<your_shared_folder_ID>
  - The DRIVE_TYPE value can be team or folder, depending on whether you’re using a team drive or a shared folder.
  - The GCP_PROJECT_ID is the project ID you get when setting up the Cloud Datastore API.
  - GOOGLE_APPLICATION_CREDENTIALS should always be set to parse_json.
  - The GOOGLE_APPLICATION_JSON value is the JSON you get when creating a service account.
  - The GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are the values from the OAuth client configuration.
- Click the Deploy app button.
That’s it. Heroku deploys your application and gives you the public URL where you can visit it.
For reference, see the NYTimes documentation on how to configure the Google Cloud Platform and deploy to Heroku.
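Since the DRIVE_ID is the last part of the folder URL, you can pull it out of a copied link with plain shell parameter expansion. This is a small sketch with a function name of my own. It assumes the URL has the structure shown above, optionally followed by a query string such as ?usp=sharing.

```shell
#!/bin/sh
# Hypothetical helper: print the last path segment of a Drive folder URL,
# ignoring any ?query part and one trailing slash.
drive_id_from_url() {
    url="${1%%\?*}"    # drop a query string, if present
    url="${url%/}"     # drop one trailing slash, if present
    echo "${url##*/}"  # keep what follows the last slash
}
```

For example, drive_id_from_url 'https://drive.google.com/drive/folders/<your_shared_folder_ID>' prints just the ID part.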
Deploying to AWS
If you don’t want to use Heroku, you can create a Docker image of the application to deploy somewhere else.
In this case, I used the Amazon Elastic Container Registry (ECR) to host the Docker image and the Amazon Elastic Container Service (ECS) to deploy it. The following sections walk through the steps.
Create a Docker Image
To create the Docker image of the app,
- Install Docker if you don’t have it already.
- Go to the directory where you cloned the Library repository. There should be a Dockerfile, which should look like this:
FROM nytimes/library
# install custom files
COPY . ./custom/
# install custom npm packages
WORKDIR /usr/src/tmp
COPY package*.json ./
RUN npm i
RUN yes | cp -rf ./node_modules/* /usr/src/app/node_modules
# return to app and build
WORKDIR /usr/src/app
RUN npm run build
# start app
CMD [ "npm", "start" ]
- From a Terminal window at your project directory, run the following command:
docker build -t <the_name_you_want_for_your_image> .
- If you want to run your image locally, you’ll need to add a hostname to your local hosts file so you can add it to the Authorized domains list on your Google Cloud Platform. To do so:
  - Open a Terminal window
  - Type
  sudo tee -a /etc/hosts
  - Type a line that maps 127.0.0.1 to the name you want for your localhost, and press Enter
  - Type Ctrl-d to end the input and save the file
  - Go to your Google Console
  - From the main menu, navigate to APIs & Services ➡ Credentials
  - Click the OAuth consent screen tab
  - Add the domain under the Authorized domains section, hit Enter, and click Save
  - Go back to the Credentials tab, click the name of your OAuth Client, add the same domain under the Authorized redirect URIs section with the following structure:
  https://yourdomain:port/auth/redirect
  and click Save
- Run the following command to run your image locally:
docker run --env-file ./.env -p 8888:8000 <your_image>
Your container should now be running, with the app reachable on the host port you specified (8888 in this example).
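The hosts file line and the redirect URI from the steps above both follow a fixed pattern, so you can generate them from the hostname and port you picked. The sketch below uses two function names of my own, and library.example.com in the usage note is only a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the line to append to /etc/hosts for a local hostname.
hosts_entry() {
    echo "127.0.0.1 $1"
}

# Hypothetical helper: print the redirect URI in the structure used above,
# https://yourdomain:port/auth/redirect
oauth_redirect_uri() {
    echo "https://$1:$2/auth/redirect"
}
```

For example, hosts_entry library.example.com | sudo tee -a /etc/hosts appends the mapping, and oauth_redirect_uri library.example.com 8888 prints the value to paste into the Authorized redirect URIs section.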
Push your Image to AWS ECR
To upload your Docker image to AWS ECR and be able to use it later,
- Go to your AWS Console
- From the Services dropdown at the top bar, search for ECR and select it to open
- Click the Get Started button
- Name your repository (preferably with the same name you used for your image) and click Create repository
Your repository gets created and you are redirected to a screen with the repository name and its URI
- Click the repository name
- Click the View push commands button
- Open a Terminal window and run the first command, which looks like this one:
aws ecr get-login --no-include-email --region us-east-1
The region can change depending on which AWS region you’re using
NOTE: You need to have the AWS CLI installed on your computer
The response is the command you have to use to authenticate your Docker client to your AWS registry
- Copy the command provided in the response and run it
- Since you already have your image, you can skip the build command and run the tag command, which is something like this:
docker tag <your_image_name>:latest 866674269210.dkr.ecr.us-east-1.amazonaws.com/<your_repo_name>:latest
- Run the last command to push your image to the registry:
docker push 866674269210.dkr.ecr.us-east-1.amazonaws.com/<your_repo_name>:latest
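The registry address in the tag and push commands is built from your AWS account ID, the region, and the repository name. The sketch below composes it and prints both commands so you can review them first. The function names are my own, and 123456789012 in the usage note is a placeholder account ID.

```shell
#!/bin/sh
# Hypothetical helper: print the full ECR image URI.
# Arguments: ACCOUNT_ID REGION REPO_NAME [TAG], where TAG defaults to latest.
ecr_image_uri() {
    echo "$1.dkr.ecr.$2.amazonaws.com/$3:${4:-latest}"
}

# Hypothetical helper: print (not run) the tag and push commands for a local image.
# Arguments: LOCAL_IMAGE ACCOUNT_ID REGION REPO_NAME
ecr_push_commands() {
    uri=$(ecr_image_uri "$2" "$3" "$4")
    echo "docker tag $1:latest $uri"
    echo "docker push $uri"
}
```

For example, ecr_push_commands library 123456789012 us-east-1 library prints the two commands to run. Note that aws ecr get-login only exists in version 1 of the AWS CLI. Version 2 replaced it with aws ecr get-login-password, whose output is piped into docker login.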
Host your Image on AWS ECS
To host your image and deploy it with AWS ECS and Fargate,
- Go to your AWS Console
- From the Services dropdown at the top bar, choose ECS
- Click Get Started
- Navigate to the custom panel and click Configure
- Fill the required details:
  - Give a name to the container
  - In the Image box, paste your repository URI from the Push your Image to AWS ECR section
  - In the Port mappings, put 3000 or the port number you want your app to listen on
  - Click Advanced container configuration to expand the section
  - In the Environment section, add 1024 CPU units
  - Fill in the Env Variables values with the key-value pairs from the .env file.
  Use a secret manager service for sensitive values such as secret keys
  - In the Storage and Logging section, change the awslogs-group value to /ecs/library-app-task
- Click the Update button
You’ll see a warning message about the Task CPU. You’ll fix that in the next step
- Navigate to the Task definition section and click the Edit button
  - Rename the Task definition name to library-app-task
  - Change Task memory value to 4GB
  - Change Task CPU value to 2vCPU
  - Click Save and then click Next
- In the Define your service section, select the Application Load Balancer option for the Load balancer type and click Next
- Name your cluster library-app-cluster and click Next
You can see your services being launched. It can take a few minutes to finish
- Once the process is complete, click View service
- Click the Target Group Name under Load Balancing
You’re redirected to your EC2 management console
- In the Description tab, scroll down and click the Load balancer name
You’re redirected to the Load balancer details
- In the Description tab, scroll down and copy the DNS name
- Append :3000 (or the port you chose earlier) at the end and visit that address
That’s it! Your Library application is up and running!