Using core.async and Transducers to upload files from the browser to S3
In a project I'm working on we needed to enable users to upload
media content. In many scenarios it makes sense to upload to S3
directly from the browser instead of routing it through a
server. If you're hosting on Heroku you need to do this anyway.
After digging a bit into core.async, this seemed like a neat little excuse to give Clojure's new transducers a go.
The Problem
To upload files directly to S3 without any server in between you need to do a couple of things:
- Enable Cross-Origin Resource Sharing (CORS) on your bucket
- Provide special parameters in the request that authorize the upload
Enabling CORS is fairly straightforward: just follow the documentation provided by AWS. The aforementioned special parameters are based on your AWS credentials, the key you want to save the file to, its content type and a few other things. Because you don't want to store your credentials in client-side code, the parameters need to be computed on a server.
We end up with the following procedure to upload a file to S3:
- Get a JavaScript File object from the user
- Retrieve the special parameters for the POST request from the server
- POST directly from the browser to S3
Server-side code
I won't go into detail here, but here's some rough Clojure code illustrating the construction of the special parameters and how they're sent to the client.
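The original snippet isn't reproduced in this text. As a rough idea of what such a computation can look like, here's a minimal sketch assuming the older browser-based POST scheme (signature version 2), where the Base64-encoded policy document is signed with HMAC-SHA1 using the secret key. The namespace, the function names and the sample policy are made up for illustration, and how the result gets sent to the client is left out.

```clojure
(ns example.s3-sign
  (:import (javax.crypto Mac)
           (javax.crypto.spec SecretKeySpec)
           (java.util Base64)))

(defn base64
  "Base64-encodes a byte array into a string."
  [^bytes bs]
  (.encodeToString (Base64/getEncoder) bs))

(defn hmac-sha1
  "Returns the raw HMAC-SHA1 bytes of message, keyed with secret."
  [^String secret ^String message]
  (let [mac (Mac/getInstance "HmacSHA1")]
    (.init mac (SecretKeySpec. (.getBytes secret "UTF-8") "HmacSHA1"))
    (.doFinal mac (.getBytes message "UTF-8"))))

(defn sign-upload
  "Takes credentials and a policy document (a JSON string) and returns
  the fields the browser would add to its POST request.
  Assumption: legacy signature version 2 for browser-based uploads."
  [{:keys [access-key secret-key]} ^String policy-json]
  (let [policy (base64 (.getBytes policy-json "UTF-8"))]
    {:AWSAccessKeyId access-key
     :policy         policy
     :signature      (base64 (hmac-sha1 secret-key policy))}))

;; Hypothetical policy: bucket name, key prefix and expiration are placeholders.
(def sample-policy
  (str "{\"expiration\": \"2030-01-01T00:00:00Z\", "
       "\"conditions\": [{\"bucket\": \"my-bucket\"}, "
       "[\"starts-with\", \"$key\", \"uploads/\"]]}"))

(sign-upload {:access-key "my-access-key" :secret-key "my-secret"}
             sample-policy)
```

The secret key never leaves the server; the browser only receives the access key ID, the encoded policy and the signature.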
Client-side: Transducers and core.async
As we've seen, the process involves multiple asynchronous steps: getting the file from the user, retrieving the special parameters from the server and posting to S3.
To wrap all that up into a minimal API that hides the back and forth happening until a file is uploaded, core.async channels and transducers turned out to be very useful:
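;; Assumes chan and alts! are referred from cljs.core.async and the go macro
;; from cljs.core.async.macros; log, sign-file and upload-file live elsewhere.
(defn s3-upload [report-chan]
  (let [;; signed parameters arriving here trigger the upload to S3
        upload-files (map #(upload-file % report-chan))
        upload-chan  (chan 10 upload-files)
        ;; File objects arriving here trigger the signing request
        sign-files   (map #(sign-file % upload-chan))
        signing-chan (chan 10 sign-files)]
    (go (while true
          (let [[v ch] (alts! [signing-chan upload-chan])]
            ; that's not really required but has been useful
            (log v))))
    signing-chan))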
This function takes one channel as an argument, onto which it will put! the result of the S3 request. You can take a look at the upload-file and sign-file functions in this gist.
So what's happening here? We use a channel for each step of the process: signing-chan and upload-chan. Both of those channels have an associated transducer. In this case it's best to think of a transducer as a function that's applied to each item on its way through the channel. I initially tripped over the fact that, as far as I could observe, the transducing function is only applied when the element is being taken from the channel as well. Just putting things into a channel didn't trigger the execution of the transducing function.
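To make the "function applied to each item" idea a bit more concrete, here's a small sketch that uses plain collections instead of channels. start-upload! is a hypothetical stand-in for a side-effecting step like sign-file or upload-file; a transducer built this way is the same kind of value that chan accepts as its second argument.

```clojure
(ns example.transducers
  (:require [clojure.string :as str]))

;; Records which items have been processed, so the side effect is visible.
(def started (atom []))

;; Hypothetical stand-in for sign-file / upload-file: kicks off some work
;; (here it only records the name) and passes the item on.
(defn start-upload! [file-name]
  (swap! started conj file-name)
  file-name)

;; Calling map without a collection returns a transducer.
(def upload-files (map start-upload!))

;; Transducers compose with comp; the filter step runs before the map step.
(def png-only
  (comp (filter #(str/ends-with? % ".png"))
        upload-files))

;; The transducer is independent of where the items come from:
;; here they come from a vector and end up in a vector.
(into [] png-only ["a.png" "notes.txt" "b.png"])
```

After the last expression, started holds only the two .png names, because the filter step dropped notes.txt before it ever reached start-upload!.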
signing-chan's transducer initiates the request to sign the File object that has been put into the channel. The second argument to the sign-file function is a channel where the AJAX callback will put its result. Similarly, upload-chan's transducer initiates the upload to S3 based on information that has been put into the channel. A callback will then put S3's response into the supplied report-chan.
The last line returns the channel that can be used to initiate a new upload.
Using this
Putting this into a library and opening it up for other people to use isn't overly complicated; the exposed API is actually very simple. Imagine an Om component upload-form:
(defn queue-file [e owner {:keys [upload-queue]}]
  (put! upload-queue (first (array-seq (.. e -target -files)))))

(defcomponent upload-form [text owner]
  (init-state [_]
    (let [rc (chan 10)]
      {:upload-queue (s3-upload rc)
       :report-chan rc}))
  (did-mount [_]
    (let [{:keys [report-chan]} (om/get-state owner)]
      (go (while true (log (<! report-chan))))))
  (render-state [this state]
    (dom/form
      (dom/input {:type "file" :name "file"
                  :on-change #(queue-file % owner state)} nil))))
I really like how simple this is. You put a file into a channel and whenever it's done you take the result from another channel. s3-upload could take additional options like logging functions or a custom URL to retrieve the special parameters required to authorize the request to S3.
This is the first time I've done something useful with core.async and, probably less surprisingly, the first time I've played with transducers. I assume many things can be done better and I still need to look into some things, like how to properly shut down the go blocks.
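On the shutdown question, one common idiom is to let the loop end when its input channel is closed: taking from a closed, drained channel yields nil, so a loop that checks for nil can stop, whereas a (while true ...) loop would keep spinning on nil. Here's a minimal sketch of that idea, not what the code above does. It's written for Clojure on the JVM so it can be run at a REPL (in ClojureScript the namespace would be cljs.core.async and the blocking >!! and <!! don't exist), and consume! is a made-up name.

```clojure
(ns example.shutdown
  (:require [clojure.core.async :refer [chan go-loop <! >!! <!! close!]]))

(defn consume!
  "Collects values from ch until ch is closed and drained.
  Returns the go block's channel, which yields the collected values
  once the loop has exited."
  [ch]
  (go-loop [seen []]
    (if-some [v (<! ch)]
      (recur (conj seen v)) ; got a value: keep going
      seen)))               ; nil means closed: leave the loop

(def in (chan 10))
(def done (consume! in))

(>!! in :a)
(>!! in :b)
(close! in)   ; closing the input is what ends the go block

(<!! done)    ; returns [:a :b]
```

Applied to s3-upload, this would mean that closing signing-chan becomes the signal for the logging loop to stop.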
Any feedback is welcome! Tweet or mail me!
Thanks to Dave Liepmann who let me peek into some code he wrote that did similar things and to Kevin Downey (hiredman) who helped me understand core.async and transducers by answering my stupid questions in #clojure on Freenode.