Chapter 3: Asynchronous Programming
Terms defined: call stack, character encoding, class, constructor, event loop, exception, fluent interface, method, method chaining, non-blocking execution, promise, promisification, protocol, UTF-8
Callbacks work,
but they are hard to read and debug,
which means they only “work” in a limited sense.
JavaScript’s developers added promises to the language in 2015
to make callbacks easier to write and understand,
and more recently they added the keywords async
and await
as well
to make asynchronous programming easier still.
To show how these work,
we will create a class of our own called Pledge
that provides the same core features as promises.
Our explanation was inspired by Trey Huffine’s tutorial,
and we encourage you to read that as well.
Section 3.1: How can we manage asynchronous execution?
JavaScript is built around an event loop. Every task is represented by an entry in a queue; the event loop repeatedly takes a task from the front of the queue, runs it, and adds any new tasks that it creates to the back of the queue to run later. Only one task runs at a time; each has its own call stack, but objects can be shared between tasks (Figure 3.1).
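The queue discipline described above can be sketched in a few lines. This toy version is only an illustration: the names runQueue and schedule are invented here, and Node's real event loop is considerably more elaborate. It takes tasks from the front of an array, runs each one to completion, and lets a running task add new tasks to the back.

```javascript
// A toy run queue: an illustration of the idea, not Node's implementation.
const runQueue = (initialTasks) => {
  const queue = [...initialTasks]
  const order = []
  // A running task calls schedule to add a new task to the back of the queue.
  const schedule = (task) => queue.push(task)
  while (queue.length > 0) {
    const task = queue.shift() // take the task at the front of the queue
    task(schedule, order) // run it to completion before looking at the next
  }
  return order
}

const first = (schedule, order) => {
  order.push('first starts')
  schedule((schedule, order) => order.push('created by first'))
  order.push('first ends')
}
const second = (schedule, order) => order.push('second')

const order = runQueue([first, second])
console.log(order)
// [ 'first starts', 'first ends', 'second', 'created by first' ]
```

The task created by first runs last, even though it was created before second started, because new tasks always join the back of the queue.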
Most tasks execute all the code available in the order it is written.
For example,
this one-line program uses Array.forEach
to print each element of an array in turn:
[1000, 1500, 500].forEach(t => console.log(t))
1000
1500
500
However,
a handful of special built-in functions make Node switch tasks
or add new tasks to the run queue.
For example,
setTimeout
tells Node to run a callback function
after a certain number of milliseconds have passed.
Its first argument is a callback function that takes no arguments,
and its second is the delay.
When setTimeout
is called,
Node sets the callback aside for the requested length of time,
then adds it to the run queue.
(This means the task runs at least the specified number of milliseconds later.)
Why zero arguments?
setTimeout’s requirement that callback functions take no arguments
is another example of a protocol.
One way to think about it is that protocols allow old code to use new code:
whoever wrote setTimeout
couldn’t know what specific tasks we want to delay,
so they specified a way to wrap up any task at all.
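As a small illustration of this protocol, a task that needs arguments can be wrapped in a zero-argument function that supplies them. The greet function and the collected array are invented for this sketch.

```javascript
// A task that needs arguments (hypothetical example).
const collected = []
const greet = (name, punctuation) => {
  collected.push(`Hello, ${name}${punctuation}`)
}

// setTimeout's protocol asks for a function of no arguments,
// so we wrap the call in one that supplies the arguments itself.
setTimeout(() => greet('Node', '!'), 10)

// Nothing has run yet: the wrapped task is still waiting.
console.log(collected.length) // 0
```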
As the listing below shows, the original task can generate many new tasks before it completes, and those tasks can run in a different order than the order in which they were created (Figure 3.2).
[1000, 1500, 500].forEach(t => {
console.log(`about to setTimeout for ${t}`)
setTimeout(() => console.log(`inside timer handler for ${t}`), t)
})
about to setTimeout for 1000
about to setTimeout for 1500
about to setTimeout for 500
inside timer handler for 500
inside timer handler for 1000
inside timer handler for 1500
Figure 3.2: Using setTimeout to delay operations.
If we give setTimeout
a delay of zero milliseconds,
the new task can be run right away,
but any other tasks that are waiting have a chance to run as well:
[1000, 1500, 500].forEach(t => {
console.log(`about to setTimeout for ${t}`)
setTimeout(() => console.log(`inside timer handler for ${t}`), 0)
})
about to setTimeout for 1000
about to setTimeout for 1500
about to setTimeout for 500
inside timer handler for 1000
inside timer handler for 1500
inside timer handler for 500
We can use this trick to build a generic non-blocking function that takes a callback defining a task and switches tasks if any others are available:
const nonBlocking = (callback) => {
setTimeout(callback, 0)
}
[1000, 1500, 500].forEach(t => {
console.log(`about to do nonBlocking for ${t}`)
nonBlocking(() => console.log(`inside timer handler for ${t}`))
})
about to do nonBlocking for 1000
about to do nonBlocking for 1500
about to do nonBlocking for 500
inside timer handler for 1000
inside timer handler for 1500
inside timer handler for 500
Node’s built-in function setImmediate
does essentially what our nonBlocking
function does, as the example below shows.
(Node also has process.nextTick,
which doesn’t do quite the same thing; we’ll explore the differences in the exercises.)
[1000, 1500, 500].forEach(t => {
console.log(`about to do setImmediate for ${t}`)
setImmediate(() => console.log(`inside immediate handler for ${t}`))
})
about to do setImmediate for 1000
about to do setImmediate for 1500
about to do setImmediate for 500
inside immediate handler for 1000
inside immediate handler for 1500
inside immediate handler for 500
Section 3.2: How do promises work?
Before we start building our own promises, let’s look at how we want them to work:
import Pledge from './pledge.js'
new Pledge((resolve, reject) => {
console.log('top of a single then clause')
setTimeout(() => {
console.log('about to call resolve callback')
resolve('this is the result')
}, 0)
}).then((value) => {
console.log(`in 'then' with "${value}"`)
return 'first then value'
})
top of a single then clause
about to call resolve callback
in 'then' with "this is the result"
This short program creates a new Pledge
with a callback that takes two other callbacks as arguments:
resolve
(which will run when everything worked)
and reject
(which will run when something went wrong).
The top-level callback does the first part of what we want to do,
i.e.,
whatever we want to run before we expect a delay;
for demonstration purposes, we will use setTimeout
with zero delay to switch tasks.
Once this task resumes,
we call the resolve
callback to trigger whatever is supposed to happen after the delay.
Now look at the line with then
.
This is a method of the Pledge
object we just created,
and its job is to do whatever we want to do after the delay.
The argument to then
is yet another callback function;
it will get the value passed to resolve
,
which is how the first part of the action communicates with the second
(Figure 3.3).
In order to make this work,
Pledge’s constructor must take a single function called action.
This function must take two callbacks as arguments:
what to do if the action completes successfully
and what to do if it doesn’t (i.e., how to handle errors).
Pledge
will provide these callbacks to the action at the right times.
Pledge
also needs two methods:
then
to enable more actions
and catch
to handle errors.
To simplify things just a little bit,
we will allow users to chain as many then
s as they want,
but only allow one catch
.
Section 3.3: How can we chain operations together?
A fluent interface
is a style of object-oriented programming
in which the methods of an object return this
so that method calls can be chained together.
For example,
if our class is:
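class Fluent {
  constructor () { /* ...set up the object... */ }
  first (top) {
    // ...do something with top...
    return this
  }
  second (left, right) {
    // ...do something with left and right...
    return this // returning this here as well lets callers keep chaining
  }
}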
then we can write:
const f = new Fluent()
f.first('hello').second('and', 'goodbye')
or even
(new Fluent()).first('hello').second('and', 'goodbye')
Array’s fluent interface lets us write expressions like
Array.filter(...).map(...)
that are usually more readable than assigning intermediate results to temporary variables.
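To make the comparison concrete, here is a short sketch of the same computation written both ways. The words array is made up for the example.

```javascript
const words = ['promise', 'callback', 'async', 'await', 'then']

// With temporary variables.
const shortWords = words.filter(w => w.length <= 5)
const shouted = shortWords.map(w => w.toUpperCase())

// Chained: filter returns an array, so we can call map on its result directly.
const chained = words
  .filter(w => w.length <= 5)
  .map(w => w.toUpperCase())

console.log(chained) // [ 'ASYNC', 'AWAIT', 'THEN' ]
```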
If the original action given to our Pledge
completes successfully,
the Pledge
gives us a value by calling the resolve
callback.
We pass this value to the first then
,
pass the result of that then
to the second one,
and so on.
If any of them fail and throw an exception,
we pass that exception to the error handler.
Putting it all together,
the whole class looks like this:
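class Pledge {
  constructor (action) {
    this.actionCallbacks = []
    this.errorCallback = () => {}
    // Bind the methods so that this refers to the pledge when the action calls them.
    action(this.onResolve.bind(this), this.onReject.bind(this))
  }
  // Record a handler to run after the action resolves; return this to allow chaining.
  then (thenHandler) {
    this.actionCallbacks.push(thenHandler)
    return this
  }
  // Record the single error handler.
  catch (errorHandler) {
    this.errorCallback = errorHandler
    return this
  }
  // Run the handlers in order, passing each one's result to the next.
  onResolve (value) {
    let storedValue = value
    try {
      this.actionCallbacks.forEach((action) => {
        storedValue = action(storedValue)
      })
    } catch (err) {
      // A handler threw: discard the remaining handlers and report the error.
      this.actionCallbacks = []
      this.onReject(err)
    }
  }
  onReject (err) {
    this.errorCallback(err)
  }
}
export default Pledge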
Binding this
Pledge’s constructor makes two calls to a special function called bind.
When we create an object obj
and call a method meth
,
JavaScript sets the special variable this
to obj
inside meth
.
If we use a method as a callback,
though,
this
isn’t automatically set to the correct object.
To convert the method to a plain old function with the right this
,
we have to use bind
.
The documentation has more details and examples.
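The sketch below shows the problem and the fix on a small made-up class called Counter: calling a method that has been detached from its object fails, while the bound version works.

```javascript
class Counter {
  constructor () {
    this.count = 0
  }

  increment () {
    this.count += 1
    return this.count
  }
}

const counter = new Counter()

// Pulling the method out of the object loses its connection to counter.
const unbound = counter.increment
let unboundFailed = false
try {
  unbound() // `this` is undefined here because class bodies are strict code
} catch (err) {
  unboundFailed = true
}

// bind produces a new function whose `this` is always counter.
const bound = counter.increment.bind(counter)
bound()
bound()

console.log(unboundFailed) // true
console.log(counter.count) // 2
```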
Let’s create a Pledge
and return a value:
import Pledge from './pledge.js'
new Pledge((resolve, reject) => {
console.log('top of a single then clause')
}).then((value) => {
console.log(`then with "${value}"`)
return 'first then value'
})
top of a single then clause
Why didn’t this work?
- We can’t use return with pledges because the call stack of the task that created the pledge is gone by the time the pledge executes. Instead, we must call resolve or reject.
- We haven’t done anything that defers execution, i.e., there is no call to setTimeout, setImmediate, or anything else that would switch tasks. Our original motivating example got this right.
This example shows how we can chain actions together:
import Pledge from './pledge.js'
new Pledge((resolve, reject) => {
console.log('top of action callback with double then and a catch')
setTimeout(() => {
console.log('about to call resolve callback')
resolve('initial result')
console.log('after resolve callback')
}, 0)
console.log('end of action callback')
}).then((value) => {
console.log(`first then with "${value}"`)
return 'first value'
}).then((value) => {
console.log(`second then with "${value}"`)
return 'second value'
})
top of action callback with double then and a catch
end of action callback
about to call resolve callback
first then with "initial result"
second then with "first value"
after resolve callback
Notice that inside each then
we do use return
because these clauses all run in a single task.
As we will see in the next section,
the full implementation of Promise
allows us to run both normal code
and delayed tasks inside then
handlers.
Finally,
in this example we explicitly signal a problem by calling reject
to make sure our error handling does what it’s supposed to:
import Pledge from './pledge.js'
new Pledge((resolve, reject) => {
console.log('top of action callback with deliberate error')
setTimeout(() => {
console.log('about to reject on purpose')
reject('error on purpose')
}, 0)
}).then((value) => {
console.log(`should not be here with "${value}"`)
}).catch((err) => {
console.log(`in error handler with "${err}"`)
})
top of action callback with deliberate error
about to reject on purpose
in error handler with "error on purpose"
Section 3.4: How are real promises different?
Let’s rewrite our chained pledge with built-in promises:
new Promise((resolve, reject) => {
console.log('top of action callback with double then and a catch')
setTimeout(() => {
console.log('about to call resolve callback')
resolve('initial result')
console.log('after resolve callback')
}, 0)
console.log('end of action callback')
}).then((value) => {
console.log(`first then with "${value}"`)
return 'first value'
}).then((value) => {
console.log(`second then with "${value}"`)
return 'second value'
})
top of action callback with double then and a catch
end of action callback
about to call resolve callback
after resolve callback
first then with "initial result"
second then with "first value"
It looks almost the same,
but if we read the output carefully
we can see that the then handlers now run after the task that called resolve has finished:
“after resolve callback” is printed before “first then”.
This is a signal that Node is delaying the execution of the code in the then
handler.
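This deferral happens even when a promise is already resolved at the moment the handler is attached. The following sketch (the events array is just a way to record ordering) shows this:

```javascript
const events = []

// This promise is already resolved when we attach the handler...
Promise.resolve('value').then(v => events.push(`then handler got ${v}`))

// ...but the handler still has not run by the time the next statement executes.
events.push('end of main program')

// A timer callback runs in a later task, after the then handler has had its turn.
setTimeout(() => console.log(events), 0)
// [ 'end of main program', 'then handler got value' ]
```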
A very common pattern is to return another promise from inside then
so that the next then
is called on the returned promise,
not on the original promise
(Figure 3.4).
This is another way to implement a fluent interface:
if a method of one object returns a second object,
we can call a method of the second object immediately.
const delay = (message) => {
return new Promise((resolve, reject) => {
console.log(`constructing promise: ${message}`)
setTimeout(() => {
resolve(`resolving: ${message}`)
}, 1)
})
}
console.log('before')
delay('outer delay')
.then((value) => {
console.log(`first then: ${value}`)
return delay('inner delay')
}).then((value) => {
console.log(`second then: ${value}`)
})
console.log('after')
before
constructing promise: outer delay
after
first then: resolving: outer delay
constructing promise: inner delay
second then: resolving: inner delay
We therefore have three rules for chaining promises:
- If our code can run synchronously, just put it in then.
- If we want to use our own asynchronous function, it must create and return a promise.
- Finally, if we want to use a library function that relies on callbacks, we have to convert it to use promises. Doing this is called promisification (because programmers will rarely pass up an opportunity to add a bit of jargon to the world), and most functions in Node have already been promisified.
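The third rule can be sketched by hand. The example assumes the callback-style function follows Node's usual convention of calling back with an error first and a result second. Both slowDouble and promisifyByHand are names invented for this illustration; Node's standard library offers util.promisify for the same purpose.

```javascript
// A hypothetical callback-style function that follows Node's convention:
// the callback's first argument is an error (or null), the second is the result.
const slowDouble = (n, callback) => {
  setTimeout(() => {
    if (typeof n !== 'number') {
      callback(new Error('not a number'))
    } else {
      callback(null, 2 * n)
    }
  }, 0)
}

// Wrap any function of that shape so that it returns a promise instead.
const promisifyByHand = (fn) => {
  return (...args) => new Promise((resolve, reject) => {
    fn(...args, (err, result) => {
      if (err) {
        reject(err)
      } else {
        resolve(result)
      }
    })
  })
}

const slowDoubleAsync = promisifyByHand(slowDouble)

slowDoubleAsync(21)
  .then(value => console.log(value)) // 42
slowDoubleAsync('oops')
  .catch(err => console.log(err.message)) // not a number
```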
Section 3.5: How can we build tools with promises?
Promises may seem more complex than callbacks right now,
but that’s because we’re looking at how they work rather than at how to use them.
To explore the latter subject,
let’s use promises to build a program to count the number of lines in a set of files.
A few moments of search on NPM turns up a promisified version of fs-extra
called fs-extra-promise
,
so we will rely on it for file operations.
Our first step is to count the lines in a single file:
import fs from 'fs-extra-promise'
const filename = process.argv[2]
fs.readFileAsync(filename, { encoding: 'utf-8' })
.then(data => {
const length = data.split('\n').length - 1
console.log(`${filename}: ${length}`)
})
.catch(err => {
console.error(err.message)
})
node count-lines-single-file.js count-lines-single-file.js
count-lines-single-file.js: 12
Character encoding
A character encoding
specifies how characters are stored as bytes.
The most widely used is UTF-8,
which stores ASCII characters (unaccented Latin letters, digits, and common punctuation) in a single byte
and uses multi-byte sequences for other symbols.
If we don’t specify a character encoding,
fs.readFileAsync
gives us an array of bytes rather than a string of characters.
We can tell we’ve made this mistake when we try to call a method of String
and Node tells us we can’t.
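A quick way to see the difference between characters and bytes is to encode a few strings and count the bytes. This sketch assumes a version of Node in which TextEncoder and TextDecoder are available as globals; the byteCount helper is invented for the example.

```javascript
const encoder = new TextEncoder() // always encodes to UTF-8
const byteCount = (text) => encoder.encode(text).length

console.log(byteCount('a')) // 1: ASCII characters take one byte
console.log(byteCount('\u00e9')) // 2: e with an acute accent
console.log(byteCount('\u20ac')) // 3: the euro sign

// Decoding turns the bytes back into a string of characters.
const bytes = encoder.encode('caf\u00e9')
const text = new TextDecoder('utf-8').decode(bytes)
console.log(bytes.length, text.length) // 5 4
```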
The next step is to count the lines in multiple files.
We can use glob-promise
to delay handling the output of glob
,
but we need some way to create a separate task to count the lines in each file
and to wait until those line counts are available before exiting our program.
The tool we want is Promise.all
,
which waits until all of the promises in an array have completed.
To make our program a little more readable,
we will put the creation of the promise for each file in a separate function:
import glob from 'glob-promise'
import fs from 'fs-extra-promise'
const main = (srcDir) => {
glob(`${srcDir}/**/*.*`)
.then(files => Promise.all(files.map(f => lineCount(f))))
.then(counts => counts.forEach(c => console.log(c)))
.catch(err => console.log(err.message))
}
const lineCount = (filename) => {
return new Promise((resolve, reject) => {
fs.readFileAsync(filename, { encoding: 'utf-8' })
.then(data => resolve(data.split('\n').length - 1))
.catch(err => reject(err))
})
}
const srcDir = process.argv[2]
main(srcDir)
node count-lines-globbed-files.js .
10
1
12
4
1
...
3
2
5
2
14
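Two properties of Promise.all are worth seeing in isolation: results come back in the order of the input array rather than the order of completion, and a single rejection rejects the combined promise. The after helper below is invented for this sketch.

```javascript
// Hypothetical helper: a promise that resolves with value after ms milliseconds.
const after = (ms, value) => {
  return new Promise(resolve => setTimeout(() => resolve(value), ms))
}

// Results come back in the order of the input array, not in order of completion.
const allDone = Promise.all([after(30, 'slow'), after(10, 'fast')])
allDone.then(values => console.log(values)) // [ 'slow', 'fast' ]

// If any promise rejects, the combined promise rejects with that reason.
const oneFails = Promise.all([
  after(10, 'fine'),
  Promise.reject(new Error('broken'))
])
oneFails.catch(err => console.log(err.message)) // broken
```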
However,
we want to display the names of the files whose lines we’re counting along with the counts.
To do this our then
must return two values.
We could put them in an array,
but it’s better practice to construct a temporary object with named fields
(Figure 3.5).
This approach allows us to add or rearrange fields without breaking code
and also serves as a bit of documentation.
With this change
our line-counting program becomes:
import glob from 'glob-promise'
import fs from 'fs-extra-promise'
const main = (srcDir) => {
glob(`${srcDir}/**/*.*`)
.then(files => Promise.all(files.map(f => lineCount(f))))
.then(counts => counts.forEach(
c => console.log(`${c.lines}: ${c.name}`)))
.catch(err => console.log(err.message))
}
const lineCount = (filename) => {
return new Promise((resolve, reject) => {
fs.readFileAsync(filename, { encoding: 'utf-8' })
.then(data => resolve({
name: filename,
lines: data.split('\n').length - 1
}))
.catch(err => reject(err))
})
}
const srcDir = process.argv[2]
main(srcDir)
As in Chapter 2,
this works until we run into a directory whose name matches *.*
,
which we do when counting the lines in the contents of node_modules
.
The solution once again is to use stat
to check if something is a file or not
before trying to read it.
And since stat
returns an object that doesn’t include the file’s name,
we create another temporary object to pass information down the chain of then
s.
import glob from 'glob-promise'
import fs from 'fs-extra-promise'
const main = (srcDir) => {
  glob(`${srcDir}/**/*.*`)
    .then(files => Promise.all(files.map(f => statPair(f))))
    .then(files => files.filter(pair => pair.stats.isFile()))
    .then(files => files.map(pair => pair.filename))
    .then(files => Promise.all(files.map(f => lineCount(f))))
    .then(counts => counts.forEach(
      c => console.log(`${c.lines}: ${c.name}`)))
    .catch(err => console.log(err.message))
}
const statPair = (filename) => {
  return new Promise((resolve, reject) => {
    fs.statAsync(filename)
      .then(stats => resolve({ filename, stats }))
      .catch(err => reject(err))
  })
}
const lineCount = (filename) => {
  return new Promise((resolve, reject) => {
    fs.readFileAsync(filename, { encoding: 'utf-8' })
      .then(data => resolve({
        name: filename,
        lines: data.split('\n').length - 1
      }))
      .catch(err => reject(err))
  })
}
const srcDir = process.argv[2]
main(srcDir)
node count-lines-with-stat.js .
10: ./assign-immediately.js
1: ./assign-immediately.out
12: ./await-fs.js
4: ./await-fs.out
1: ./await-fs.sh
...
3: ./x-multiple-catch/example.js
2: ./x-multiple-catch/example.txt
5: ./x-trace-load.md
2: ./x-trace-load/config.yml
14: ./x-trace-load/example.js
This code is complex, but much simpler than it would be if we were using callbacks.
Lining things up
This code uses the expression {filename, stats}
to create an object whose keys are filename
and stats
,
and whose values are the values of the corresponding variables.
Doing this makes the code easier to read,
both because it’s shorter
and because it signals that the value associated with the key filename
is exactly the value of the variable with the same name.
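A minimal sketch of the equivalence, using a made-up stats object in place of the one that stat returns:

```javascript
const filename = 'example.txt'
const stats = { size: 120 } // stand-in for the object that stat returns

// These two expressions build objects with the same keys and values.
const longhand = { filename: filename, stats: stats }
const shorthand = { filename, stats }

console.log(shorthand.filename === longhand.filename) // true
console.log(shorthand.stats === longhand.stats) // true
```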
Section 3.6: How can we make this more readable?
Promises eliminate the deep nesting associated with callbacks of callbacks,
but they are still hard to follow.
The latest versions of JavaScript provide two new keywords async
and await
to flatten code further.
async
means “this function implicitly returns a promise”,
while await
means “wait for a promise to resolve”.
This short program uses both keywords to print the first ten characters of a file:
import fs from 'fs-extra-promise'
const firstTenCharacters = async (filename) => {
const text = await fs.readFileAsync(filename, 'utf-8')
console.log(`inside, raw text is ${text.length} characters long`)
return text.slice(0, 10)
}
console.log('about to call')
const result = firstTenCharacters(process.argv[2])
console.log(`function result has type ${result.constructor.name}`)
result.then(value => console.log(`outside, final result is "${value}"`))
about to call
function result has type Promise
inside, raw text is 24 characters long
outside, final result is "Begin at t"
Translating code
When Node sees await and async, it silently converts the code to use promises with then, resolve, and reject; we will see how this works in Chapter 15. In order to provide a context for this transformation, we must put await inside a function that is declared to be async: in older versions of Node, and in CommonJS scripts, we can’t simply write await fs.statAsync(...) at the top level of our program outside a function. (Recent versions of Node do allow await at the top level of an ES module.) This requirement is occasionally annoying, but since we should be putting our code in functions anyway it’s hard to complain.
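To give a feel for the correspondence, here is a small function written with async and await next to a hand-written version that uses then. This is only a sketch of the idea, not the transformation Node actually performs, and fetchNumber is a hypothetical stand-in for any asynchronous operation.

```javascript
// Hypothetical asynchronous operation used by both versions.
const fetchNumber = () => Promise.resolve(20)

// Written with async and await...
const addOneAwait = async () => {
  const value = await fetchNumber()
  return value + 1
}

// ...and a hand-written version that behaves similarly using then.
const addOneThen = () => {
  return fetchNumber().then(value => value + 1)
}

addOneAwait().then(result => console.log(result)) // 21
addOneThen().then(result => console.log(result)) // 21
```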
To see how much cleaner our code is with await
and async
,
let’s rewrite our line counting program to use them.
First,
we modify the two helper functions to look like they’re waiting for results and returning them.
They actually wrap their results in promises and return those,
but Node now takes care of that for us:
const statPair = async (filename) => {
const stats = await fs.statAsync(filename)
return { filename, stats }
}
const lineCount = async (filename) => {
const data = await fs.readFileAsync(filename, 'utf-8')
return {
filename,
lines: data.split('\n').length - 1
}
}
Next,
we modify main
to wait for things to complete.
We must still use Promise.all
to handle the promises
that are counting lines for individual files,
but the result is less cluttered than our previous version.
const main = async (srcDir) => {
  const files = await glob(`${srcDir}/**/*.*`)
  const pairs = await Promise.all(
    files.map(async filename => await statPair(filename))
  )
  const filtered = pairs
    .filter(pair => pair.stats.isFile())
    .map(pair => pair.filename)
  const counts = await Promise.all(
    filtered.map(async name => await lineCount(name))
  )
  counts.forEach(
    ({ filename, lines }) => console.log(`${lines}: ${filename}`)
  )
}
const srcDir = process.argv[2]
main(srcDir)
Section 3.7: How can we handle errors with asynchronous code?
We created several intermediate variables in the line-counting program to make the steps clearer. Doing this also helps with error handling; to see how, we will build up an example in stages.
First,
if we return a promise that fails without using await
,
then our main function will finish running before the error occurs,
and our try
/catch
doesn’t help us
(Figure 3.6):
async function returnImmediately () {
try {
return Promise.reject(new Error('deliberate'))
} catch (err) {
console.log('caught exception')
}
}
returnImmediately()
/u/stjs/async-programming/return-immediately.js:3
One solution to this problem is to be consistent and always return something.
Because the function is declared async
,
the Error
in the code below is automatically wrapped in a promise
so we can use .then
and .catch
to handle it as before:
async function returnImmediately () {
try {
return Promise.reject(new Error('deliberate'))
} catch (err) {
return new Error('caught exception')
}
}
const result = returnImmediately()
result.catch(err => console.log(`caller caught ${err}`))
caller caught Error: deliberate
If instead we return await
,
the function waits until the promise settles before returning.
Because the promise was rejected, await throws its rejection reason as an exception,
and since we’re inside the scope of our try
/catch
block,
everything works as we want:
async function returnAwait () {
try {
return await Promise.reject(new Error('deliberate'))
} catch (err) {
console.log('caught exception')
}
}
returnAwait()
caught exception
We prefer the second approach, but whichever you choose, please be consistent.
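As a further sketch of the return await pattern, the function below recovers from a failure and hands the caller a normal value. The names mightFail and withRecovery are invented for the example.

```javascript
// Hypothetical operation that fails on request.
const mightFail = async (shouldFail) => {
  if (shouldFail) {
    throw new Error('deliberate')
  }
  return 'ok'
}

// return await keeps the failure inside the scope of the try/catch.
const withRecovery = async (shouldFail) => {
  try {
    return await mightFail(shouldFail)
  } catch (err) {
    return `recovered from ${err.message}`
  }
}

withRecovery(false).then(result => console.log(result)) // ok
withRecovery(true).then(result => console.log(result)) // recovered from deliberate
```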
Section 3.8: Exercises
Immediate versus next tick
What is the difference between setImmediate
and process.nextTick
?
When would you use each one?
Tracing promise execution
- What does this code print and why?
Promise.resolve('hello')
- What does this code print and why?
Promise.resolve('hello').then(result => console.log(result))
- What does this code print and why?
const p = new Promise((resolve, reject) => resolve('hello'))
  .then(result => console.log(result))
Hint: try each snippet of code interactively in the Node interpreter and as a command-line script.
Multiple catches
Suppose we create a promise that deliberately fails and then add two error handlers:
const oops = new Promise((resolve, reject) => reject(new Error('failure')))
oops.catch(err => console.log(err.message))
oops.catch(err => console.log(err.message))
When the code is run it produces:
failure
failure
- Trace the order of operations: what is created and when is it executed?
- What happens if we run these same lines interactively? Why do we see something different than what we see when we run this file from the command line?
Then after catch
Suppose we create a promise that deliberately fails
and attach both then
and catch
to it:
new Promise((resolve, reject) => reject(new Error('failure')))
.catch(err => console.log(err))
.then(err => console.log(err))
When the code is run it produces:
Error: failure
at /u/stjs/promises/catch-then/example.js:1:41
at new Promise (<anonymous>)
at Object.<anonymous> (/u/stjs/promises/catch-then/example.js:1:1)
at Module._compile (internal/modules/cjs/loader.js:1151:30)
at Object.Module._extensions..js \
(internal/modules/cjs/loader.js:1171:10)
at Module.load (internal/modules/cjs/loader.js:1000:32)
at Function.Module._load (internal/modules/cjs/loader.js:899:14)
at Function.executeUserEntryPoint [as runMain] \
(internal/modules/run_main.js:71:12)
at internal/main/run_main_module.js:17:47
undefined
- Trace the order of execution.
- Why is
undefined
printed at the end?
Head and tail
The Unix head
command shows the first few lines of one or more files,
while the tail
command shows the last few.
Write programs head.js
and tail.js
that do the same things using promises and async
/await
,
so that:
node head.js 5 first.txt second.txt third.txt
prints the first five lines of each of the three files and:
node tail.js 5 first.txt second.txt third.txt
prints the last five lines of each file.
Histogram of line counts
Extend count-lines-with-stat-async.js
to create a program lh.js
that prints two columns of output:
the number of lines in one or more files
and the number of files that are that long.
For example,
if we run:
node lh.js promises/*.*
the output might be:
Length | Number of Files |
---|---|
1 | 7 |
3 | 3 |
4 | 3 |
6 | 7 |
8 | 2 |
12 | 2 |
13 | 1 |
15 | 1 |
17 | 2 |
20 | 1 |
24 | 1 |
35 | 2 |
37 | 3 |
38 | 1 |
171 | 1 |
Select matching lines
Using async
and await
,
write a program called match.js
that finds and prints lines containing a given string.
For example:
node match.js Toronto first.txt second.txt third.txt
would print all of the lines from the three files that contain the word “Toronto”.
Find lines in all files
Using async
and await
,
write a program called in-all.js
that finds and prints lines found in all of its input files.
For example:
node in-all.js first.txt second.txt third.txt
will print those lines that occur in all three files.
Find differences between two files
Using async
and await
,
write a program called file-diff.js
that compares the lines in two files
and shows which ones are only in the first file,
which are only in the second,
and which are in both.
For example,
if left.txt
contains:
some
people
and right.txt
contains:
write
some
code
then:
node file-diff.js left.txt right.txt
would print:
2 code
1 people
* some
2 write
where 1
, 2
, and *
show whether lines are in only the first or second file
or are in both.
Note that the order of the lines in the file doesn’t matter.
Hint: you may want to use the Set
class to store lines.
Trace file loading
Suppose we are loading a YAML configuration file
using the promisified version of the fs
library.
In what order do the print statements in this test program appear and why?
import fs from 'fs-extra-promise'
import yaml from 'js-yaml'
const test = async () => {
const raw = await fs.readFileAsync('config.yml', 'utf-8')
console.log('inside test, raw text', raw)
const cooked = yaml.safeLoad(raw)
console.log('inside test, cooked configuration', cooked)
return cooked
}
const result = test()
console.log('outside test, result is', result.constructor.name)
result.then(something => console.log('outside test we have', something))
Any and all
- Add a method Pledge.any that takes an array of pledges and, as soon as one of the pledges in the array resolves, returns a single promise that resolves with the value from that pledge.
- Add another method Pledge.all that takes an array of pledges and returns a single promise that resolves to an array containing the final values of all of those pledges.
This article may be helpful.