What's inside the V8 JavaScript engine

Photo by Tim Mossholder

JavaScript engine

JavaScript is usually described as an interpreted language, meaning it does not have to be compiled ahead of time before it is handed to the processor (CPU) for execution, the way C or C++ does. The source code still has to pass through something that converts it into machine code the CPU can understand and process. That job is done by the JavaScript engine.

The JavaScript engine is not built into the processor; it ships with the browser. Every browser has its own engine, so the way our source code is processed varies from browser to browser. The best-known JavaScript engines include V8 (Chrome), SpiderMonkey (Firefox) and JavaScriptCore (Safari).

Shoutout to an interesting repo called jsvu, which installs multiple JavaScript engines on our machine and lets us run the same JavaScript file in each of them (Demo).

Chrome's V8

In this article we deep dive into the core parts of Chrome's V8 engine. V8 is written in C++ and is part of Google's Chrome browser. Its job is to take our JavaScript source code and compile it to machine code, applying optimisations along the way. How well an engine optimises JavaScript code is a large part of what makes it popular.

Wait. V8 itself is written in C++, so does V8 need to be compiled into machine code before it can execute our JavaScript?

Yes! 🤯

V8's C++ code

Before V8 can take our JavaScript source code and compile it, V8 itself has to be compiled to machine code ahead of time and loaded for the processor to run. That machine code then takes our JavaScript and compiles it while the program is running. This run-time compilation of JavaScript is called JIT (Just-In-Time) compilation.

JavaScript is just a scripting language; its single-threaded, synchronous behaviour comes from the JavaScript engine that runs it. So JavaScript can be described as a dynamically typed language that the engine executes on a single thread, one statement at a time.
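Both properties can be seen in a small sketch. The variable names are made up for illustration, and `setTimeout` is assumed to be provided by the host (a browser or Node.js), since it is not part of the language itself.

```javascript
// Dynamic typing: the same variable can hold values of different types.
let value = 42;
const typesSeen = [typeof value];   // 'number'
value = 'forty-two';
typesSeen.push(typeof value);       // 'string'
value = { answer: 42 };
typesSeen.push(typeof value);       // 'object'

// Synchronous, single-threaded execution: statements run one after
// another, and a timer callback cannot interrupt the code that is running.
const order = [];
order.push('first');
setTimeout(() => order.push('timer'), 0); // queued, runs later
order.push('second');

console.log(typesSeen); // [ 'number', 'string', 'object' ]
console.log(order);     // [ 'first', 'second' ] (the timer has not fired yet)
```

Even with a delay of 0, the timer callback only runs after the current synchronous code has finished.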

How JavaScript code is processed in V8

workflow image of javascript inside V8

Now let's look deeper into each block in this image.

V8's JavaScript Parser

v8 javascript engine parser and pre-parser

The parser is the part of V8 that takes our JavaScript code and converts it into an AST (abstract syntax tree). Parsing is time-consuming: it takes around 10-30% of V8's time.
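To get a feel for what an AST is, here is a rough sketch of the tree for `let answer = 6 * 7;`. It uses the ESTree-style node names common in JavaScript tooling as an assumption; V8's internal AST is a C++ data structure and looks different. The `evaluate` helper is hypothetical and only shows that the tree carries enough structure to compute with.

```javascript
// ESTree-style AST for the statement: let answer = 6 * 7;
const ast = {
  type: 'VariableDeclaration',
  kind: 'let',
  declarations: [
    {
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'answer' },
      init: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Literal', value: 6 },
        right: { type: 'Literal', value: 7 },
      },
    },
  ],
};

// Hypothetical helper: walks an expression node and computes its value.
function evaluate(node) {
  switch (node.type) {
    case 'Literal':
      return node.value;
    case 'BinaryExpression': {
      const left = evaluate(node.left);
      const right = evaluate(node.right);
      if (node.operator === '*') return left * right;
      if (node.operator === '+') return left + right;
      throw new Error(`Unsupported operator: ${node.operator}`);
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const result = evaluate(ast.declarations[0].init);
console.log(result); // 42
```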

The parser in V8 is made of two parts: the pre-parser and the (full) parser.

The full parser is responsible for parsing code that has to run right away, for example IIFEs (immediately invoked function expressions). This is called eager parsing. Beyond that, it builds the AST, builds the variable scopes, and finds all syntax errors. With this many responsibilities, it is the much slower of the two.

The pre-parser, on the other hand, is about 2x faster than the parser. It is used to skip over functions that do not need to run right now (functions other than IIFEs), and mainly has to find where each declared function starts and ends.

When the function is called, it is then eager parsed and compiled.

// Top level code: 🚀 eager
let name = 'Peter Parker';

// IIFE: 🚀 eager
(function isSpiderMan() {
  const answer = name === 'Peter Parker' ? 'Yes' : 'No'
  console.log(`${answer}`)
})();

// Top level function, but not IIFE: 🐌 lazy
function isSpiderManAgain(personName) {
  const answer = personName === 'Peter Parker' ? 'Yes' : 'No'
  return answer;
}

// at the invocation call: 🚀 eager
isSpiderManAgain(name);

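In this snippet, the isSpiderManAgain function declared at line 11 is lazily parsed by the pre-parser, and when the function is called at line 17, it is eagerly parsed by the full parser. The pre-parser also has to keep track of which outer variables a skipped function refers to, because that determines which variables must be kept alive for closures.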
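Lazy parsing does not mean that skipped functions go unchecked. JavaScript reports syntax errors before any code runs, even inside a function that is never called. The sketch below shows this from the outside, without saying anything about which of V8's parsers does the check. `checkSyntax` is a hypothetical helper, and it assumes an environment where `new Function` is allowed (for example Node.js, or a page without a restrictive Content Security Policy).

```javascript
// Hypothetical helper: compiles `source` as a function body without
// running it, and reports whether the engine accepted it.
function checkSyntax(source) {
  try {
    new Function(source); // parses the body; the body is never executed
    return 'ok';
  } catch (error) {
    return error.name;
  }
}

// Neither inner function is ever called, yet the broken one is rejected.
const valid = checkSyntax('function neverCalled() { return 1 + 1; }');
const broken = checkSyntax('function neverCalled() { return 1 +; }');

console.log(valid);  // ok
console.log(broken); // SyntaxError
```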

Ignition, The Interpreter

The AST produced by the parser is then handed to the interpreter, whose job is to convert the AST into bytecode. Bytecodes are small building blocks that, joined together, make up any JavaScript functionality. There are bytecodes for operators such as Add or TypeOf. All the bytecodes available in V8 are listed here.

Later this bytecode will be compiled to machine code. That step is easier when the bytecode is designed around the same computational model as the physical CPU.

Bytecode is an abstraction of machine code.

V8 bytecode

The conversion to bytecode is handled by the BytecodeGenerator. The generated bytecode is then passed to the BytecodeInterpreter, where each bytecode is executed by its BytecodeHandler. As a result, run-once or non-hot code (code that is executed only once or rarely) is stored more compactly, in bytecode form, which reduces memory usage.
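The idea of one small handler per bytecode can be sketched as a toy interpreter. This is a heavily simplified model, not V8's implementation: the bytecode names `LdaSmi`, `Star`, `Ldar`, `Add` and `Return` are borrowed from Ignition, which works with registers plus an accumulator, but the `handlers` table, the `run` function and the array-based program format are assumptions made for illustration.

```javascript
// One handler per bytecode. `state.acc` is the accumulator register.
const handlers = {
  LdaSmi(state, value) { state.acc = value; },                // load small integer
  Star(state, reg) { state.registers[reg] = state.acc; },     // store accumulator to register
  Ldar(state, reg) { state.acc = state.registers[reg]; },     // load register to accumulator
  Add(state, reg) { state.acc = state.registers[reg] + state.acc; },
};

// Toy interpreter loop: dispatches each bytecode to its handler.
function run(bytecode) {
  const state = { acc: undefined, registers: {} };
  for (const [op, operand] of bytecode) {
    if (op === 'Return') return state.acc;
    const handler = handlers[op];
    if (!handler) throw new Error(`Unknown bytecode: ${op}`);
    handler(state, operand);
  }
  return undefined;
}

// Roughly: function f() { const a = 2; return a + 40; }
const program = [
  ['LdaSmi', 2],
  ['Star', 'r0'],
  ['LdaSmi', 40],
  ['Add', 'r0'],
  ['Return'],
];

const output = run(program);
console.log(output); // 42
```

To look at the real thing, V8 has a `--print-bytecode` flag that Node.js can pass through (`node --print-bytecode file.js`); the exact output depends on the V8 version.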

Turbofan, The Compiler

A profiler continuously observes the interpreter. Any code that runs many times (hot code) is handed to the TurboFan compiler.

TurboFan takes these hot functions together with the type feedback that was collected while they ran in the interpreter (for example, the types of their parameters) and generates machine code optimised for those types. The next time the same function is called with the same types, the optimised machine code runs directly instead of going through the whole process again.
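As an illustration, here is the kind of call pattern this is about: a small function called many times with arguments of the same type. Whether and when V8 actually optimises `add` is decided by the engine's own heuristics, so treat this only as a sketch of "hot and type-stable" code. The function name and loop count are arbitrary.

```javascript
function add(a, b) {
  return a + b;
}

// Hot and type-stable: `add` only ever sees numbers in this loop,
// which is the kind of feedback an optimising compiler can rely on.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = add(total, 1);
}

// A call with different argument types still works, but it no longer
// matches what the engine has observed so far.
const label = add('V', '8');

console.log(total); // 100000
console.log(label); // V8
```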

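If the types change, the optimised code is no longer valid, so V8 deoptimises the function and falls back to the bytecode. Remembering, at each place in the code, which kinds of objects were seen and how to access their properties is called inline caching. V8 also uses another technique called hidden classes to optimise object creation and property access.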
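A toy simulation can make the caching idea concrete. V8's hidden classes are internal and cannot be inspected from plain JavaScript, so `shapeOf` below is a stand-in that treats "same property names in the same order" as the same shape. `createPropertyLoadSite` and its hit/miss counters are hypothetical names for illustration.

```javascript
// Stand-in for a hidden class: objects with the same keys in the same
// order get the same "shape". (V8's real hidden classes are internal.)
const shapeOf = (obj) => Object.keys(obj).join(',');

// Simulates one property access site that remembers the last shape it saw.
function createPropertyLoadSite(propertyName) {
  const stats = { hits: 0, misses: 0 };
  let cachedShape = null;

  function load(obj) {
    const shape = shapeOf(obj);
    if (shape === cachedShape) {
      stats.hits += 1;    // fast path: same shape as last time
    } else {
      stats.misses += 1;  // slow path: generic lookup, then update the cache
      cachedShape = shape;
    }
    return obj[propertyName];
  }

  return { load, stats };
}

const site = createPropertyLoadSite('x');
const values = [
  site.load({ x: 1, y: 2 }), // miss: first object seen
  site.load({ x: 3, y: 4 }), // hit: same shape
  site.load({ x: 5, y: 6 }), // hit: same shape
  site.load({ y: 8, x: 7 }), // miss: same keys, different order
];

console.log(values);     // [ 1, 3, 5, 7 ]
console.log(site.stats); // { hits: 2, misses: 2 }
```

The practical takeaway is to create objects of the same kind with the same properties in the same order, so that code working on them keeps seeing one shape.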

Bytecode handlers are written in high level, machine architecture agnostic form of assembly code, and compiled by Turbofan. Here turbofan is used to build ignition and not to run it. - Ignition Design Doc

Orinoco, The Garbage collector (GC)

We allocate memory every time we create an object. That memory needs to be freed once the object is no longer used, because memory is not unlimited.

Garbage collection is the process of reclaiming memory that is no longer reachable from the running program. This is achieved with the mark-and-sweep technique.

The essential tasks that GC has to perform periodically:

  1. Identify unreachable objects (objects not in use)
  2. Reuse the memory occupied by dead objects
  3. Defragment memory
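The first two tasks can be sketched as a toy mark and sweep over a made-up heap. This is a minimal model, not Orinoco: the heap is a `Map` from hypothetical object ids to the ids they reference, `roots` stands for whatever the program can still reach directly, and defragmentation is left out.

```javascript
// Toy mark and sweep. `heap` maps an object id to { refs: [ids] }.
function markAndSweep(heap, roots) {
  // Mark: everything reachable from the roots is alive.
  const marked = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const id = stack.pop();
    if (marked.has(id)) continue;
    const object = heap.get(id);
    if (!object) continue; // ignore references to unknown ids
    marked.add(id);
    for (const ref of object.refs) stack.push(ref);
  }

  // Sweep: everything that was not marked is freed.
  const freed = [];
  for (const id of [...heap.keys()]) {
    if (!marked.has(id)) {
      heap.delete(id);
      freed.push(id);
    }
  }
  return freed;
}

const heap = new Map([
  ['a', { refs: ['b'] }],
  ['b', { refs: [] }],
  ['c', { refs: ['d'] }], // c and d reference each other...
  ['d', { refs: ['c'] }], // ...but nothing reachable points to them
]);

const freed = markAndSweep(heap, ['a']);
console.log(freed);             // [ 'c', 'd' ]
console.log([...heap.keys()]);  // [ 'a', 'b' ]
```

Note that `c` and `d` are collected even though they reference each other: what matters is reachability from the roots, not whether something still points at the object.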

Traditionally, JavaScript execution is paused while these tasks run on the main thread. If the GC takes too long to clear memory, a few frames have to be skipped while rendering the page. To avoid this, V8's garbage collector, named Orinoco, uses parallel, incremental and concurrent techniques for garbage collection in order to free up the main thread. Most of the GC work is performed in background tasks, which keeps the main thread available.

V8 not only has cool names for its departments, it also has a cool logo for each of them.

Different V8 departments

Conclusion

The better we understand how JavaScript is compiled at runtime, the better we can write performant code. In this blog post I have only scratched the surface of V8. I hope it has sparked an interest in learning more about V8 and its internals.


05/28/2020