How to use gatsby-source-s3 in your next Gatsby project
April 23rd, 2020
gatsby-source-s3, what is it?
gatsby-source-s3 is a plugin that enables you to access your AWS S3 Bucket and query the assets within it using GraphQL. Cool, so why use this? For me, I am using it alongside gatsby-transformer-sharp and gatsby-source-filesystem; combined, these enable me to import images from my bucket directly into my project and then process them for this blog. I can compress images, convert them to .webp and serve that format to devices that support it, and much more.
In this walk-through, we will solely focus on getting you to successfully query your S3 Bucket and return all the assets within it.
Prerequisites
In order to follow along you will first need to make sure you have a few things set up.
- Have the Gatsby-CLI installed. We need this in order to perform tasks such as running our dev server, building our project, or downloading Gatsby Starters (see the install commands after this list).
- Have NPM installed. NPM is a package manager, and it is what allows us to download plugins for our projects.
- An S3 Bucket with some items: images, text files, really anything. If you're just starting out with AWS, they have some great free-tier options which will work for the sake of this tutorial. If you're creating a new Bucket, remember the name of your Bucket. You will need it later.
- Your AWS Access Key ID and Secret Access Key. These are needed in order to give the plugin access to your S3 Bucket.
- And obviously a project to work with.
  - If you don't currently have a project started you can use the Gatsby hello-world Starter. This is what I will be using for this tutorial.
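If you're missing the first two items, the commands below are a quick way to get set up and verify your tooling (the global install assumes NPM is already available on your machine).
npm install -g gatsby-cli
gatsby --version
npm --version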
Getting your starter project
If you already have a project feel free to skip this section.
To get your Gatsby starter, run the command below. This is where the Gatsby-CLI comes into play. This will create a new Gatsby project for you titled my-hello-world-starter, and it will use the starter project gatsby-starter-hello-world.
gatsby new my-hello-world-starter https://github.com/gatsbyjs/gatsby-starter-hello-world
Once Gatsby has finished downloading your new project, you should be presented with a message similar to the following:
Your new Gatsby site has been successfully bootstrapped. Start developing it by running:
cd my-hello-world-starter
gatsby develop
Be sure to move into your new project directory with cd my-hello-world-starter, as this is where we will be working for the remainder of this tutorial.
Installing gatsby-source-s3
First we will need to install the plugin into our project using NPM.
npm install gatsby-source-s3 --save-dev
The --save-dev flag saves the plugin as a development dependency. Your package.json should now have the gatsby-source-s3 plugin listed as a dev dependency.
If you want to learn more about what a package.json file is, check out NPM's description here or this one by NodeJS.
"devDependencies": {
"gatsby-source-s3": "0.0.0",
},
Plugin Setup
Once the plugin has been installed, we need to tell Gatsby to use it. We do that by including it in our gatsby-config.js. The gatsby-config.js file is unique to Gatsby. It is where you save information such as metadata for your site and define your plugins.
Gatsby docs discussing gatsby-config.js
Inside of gatsby-config.js there will be an array named plugins. This is where you add all the plugins that you want your Gatsby site to use. If you're using the Gatsby hello-world Starter, this array will be empty.
module.exports = {
  plugins: [],
}
Add the following to this plugins array.
{
  resolve: "gatsby-source-s3",
  options: {
    aws: {
      // leave these empty for now
      accessKeyId: "",
      secretAccessKey: "",
    },
    buckets: ["bucket-name"],
  },
},
For the time being, leave accessKeyId and secretAccessKey empty. We will address these two next. In the buckets parameter, simply add the name of your S3 Bucket. You can find this within your S3 console.
Your gatsby-config.js should now look something like this.
module.exports = {
  plugins: [
    {
      resolve: 'gatsby-source-s3',
      options: {
        aws: {
          // leave these empty for now
          accessKeyId: '',
          secretAccessKey: '',
        },
        buckets: ['bucket-name'],
      },
    },
  ],
}
Safely securing AWS keys
Why did I have you leave your AWS access key and secret access key empty? Because we don't want to save these directly into our gatsby-config.js. Why? Because chances are you will be saving this project to some form of source control, whether that be GitHub, GitLab, BitBucket, etc., and we don't want these keys publicly available. Even if you plan on having this project as a private repo, it's best practice not to commit this sensitive data.
To fix this, we want to store these keys in a different file that will be untracked by Git. These are called environment or config variables and are typically saved in a file called .env. You'll also see .env.local, .env.dev or .env.production. You can have several of these files as your project grows, each containing different information for the different environments your project runs in.
For the sake of simplicity, I am going to create a single new file at the root of my project called .env.
Root Directory
| - node_modules/
| - src/
| - static/
| - .env <-- new file
| - .gitignore
| - gatsby-config.js
| - package-lock.json
| - package.json
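If you're working from a terminal, you can create the empty file with a single command (touch works on macOS and Linux; on Windows, just create the file from your editor instead).
touch .env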
Once we have this file created, we want to configure Git to ignore it. Again, we don't want these keys saved into our source control. Inside our .gitignore, we will add a pattern that will ignore all .env files. The * is sometimes referred to as a wildcard, meaning this pattern will match anything that comes after .env, such as .env.local or .env.dev, making sure that all environment variable files are ignored.
# ignore all environment variable files
.env*
If you're using the Gatsby hello-world Starter, this pattern will already be in your .gitignore.
Quick note: sometimes we do want, and even need, certain .env files to be saved in our source control since they contain data required for our project to function. Not all data in .env files is sensitive data. For these scenarios, you would simply rename your files and update your .gitignore to only ignore the files you want to be untracked by Git (see the example below).
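As an example, a .gitignore like the one below ignores every environment variable file except a hypothetical .env.example, a file teams sometimes keep in source control as a non-sensitive template of which keys the project expects.
# ignore all environment variable files...
.env*
# ...except the non-sensitive template
!.env.example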
Saving your keys into your new .env file
Inside this new file we are going to store all our sensitive data as key-value pairs. What you save as the key is important, as these will be the names we will later use to retrieve their corresponding values. I am calling mine keyID and secretKey. If you're using the same names, here is what your file should now look like.
keyID = your-id
secretKey = your-secret-key
Great. So how do we access these values?
In order to give gatsby-source-s3 access to our keys, we use a package that is already a dependency of Gatsby: dotenv.
In order to utilize this package, we again need to first tell Gatsby to use it. This is again done by adding it to our gatsby-config.js. However, we won't be adding it in the same way we did our gatsby-source-s3 plugin. Here we want to require it at the very top of our file.
require('dotenv').config()

module.exports = {
  plugins: [
    {
      resolve: 'gatsby-source-s3',
      options: {
        aws: {
          // leave these empty for now
          accessKeyId: '',
          secretAccessKey: '',
        },
        buckets: ['bucket-name'],
      },
    },
  ],
}
If you named your file .env like I did above, this is all you need to do. However, if you decided to name your file something else like .env.aws, .env.dev or .env.local (which is totally fine if you did), then you are going to have to do a little more work. You need to tell dotenv to look for that specific file. You do that by setting the path parameter equal to the location and name of your file.
require('dotenv').config({
  path: `your-file.env`,
})
Now that we have successfully required the dotenv package and configured it to reference the file containing our keys, we can supply gatsby-source-s3 with those keys.
Going back to our gatsby-source-s3 settings, we can now access these keys by adding process.env.{your-key-name} in the correct locations. This is where the names of those keys are important.
Your gatsby-config.js should now look something like this.
require('dotenv').config()

module.exports = {
  plugins: [
    {
      resolve: 'gatsby-source-s3',
      options: {
        aws: {
          accessKeyId: process.env.keyID,
          secretAccessKey: process.env.secretKey,
        },
        buckets: ['bucket-name'],
      },
    },
  ],
}
What is process.env?
Quick detour.
process.env is an object that contains information about our environment. If you want to visually see this (sometimes this helps to better understand what it is), add console.log(process.env) to your gatsby-config.js directly under where you required the dotenv package and run gatsby develop. You should see a ton of information get printed to your console, including the keys we just defined in our .env.
Appending .your-key-name to process.env cherry-picks that specific piece of data from the rest of your environment data and returns the value within it.
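Here is a minimal sketch of that check, assuming the keyID name from the .env file above. Remember to remove the console.log lines once you're done poking around.
// gatsby-config.js
require('dotenv').config()

// temporary: print the whole environment object, then just our key
console.log(process.env)
console.log(process.env.keyID)

module.exports = {
  plugins: [
    // ...
  ],
}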
Back on track. Let's see this work
With everything installed and configured, you should now be able to run gatsby develop and have gatsby-source-s3 access your S3 Bucket, making all the assets in there available to you via GraphQL.
Once you've started your development server, go to http://localhost:8000/___graphql. You should now see 4 new values in your Explorer panel.
- allS3Image
- allS3Object
- s3Image
- s3Object
To have GraphQL return all of your assets, run the following query. This will return all items found in your bucket: images, folders, PDFs, etc.
Url will return the URL you can use to access the asset, and Key will return the name of the asset.
query S3Query {
  allS3Object {
    edges {
      node {
        Url
        Key
      }
    }
  }
}
Hopefully this returned some data for you. Here is an example of what that returned data could look like.
{
  "data": {
    "allS3Object": {
      "edges": [
        {
          "node": {
            "Url": "https://s3.amazonaws.com/bucket-name/name-of-file.png",
            "Key": "name-of-file.png"
          }
        }
      ]
    }
  }
}
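Gatsby also auto-generates filter arguments for every node type, so you can narrow the results instead of returning everything. The query below is just a sketch that assumes the same Key field shown above and uses a regex filter to return only .png objects; adjust the pattern to match your own assets.
query PngObjects {
  allS3Object(filter: { Key: { regex: "/\\.png$/i" } }) {
    edges {
      node {
        Url
        Key
      }
    }
  }
}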
Need help?
If you run into any issues or questions please feel free to reach out to me via Twitter. I am more than happy to help.
Thanks!