[squirrelfish] JavaScriptCore renames
Maciej Stachowiak
mjs at apple.com
Sat Nov 15 13:23:57 PST 2008
On Nov 15, 2008, at 12:07 PM, Geoffrey Garen wrote:
>> That being said, an overview of the renames planned would be helpful.
>
> Okeedokee.
>
> After much discussion, we concluded that almost all names in the
> field of virtual machines are potentially problematic, because each
> name can mean different things in different contexts, and each name
> usually fails to say precisely what it means. For example, "JIT",
> which technically just stands for "Just In Time", could refer to any
> of a million different processes on a computer, including the "just
> in time" initialization of a data member in a class. Even if we
> allow that "JIT" implicitly refers to CPU-specific translation at
> runtime, making it less vague, "just in time" is still an inaccurate
> phrase, since function-at-a-time translation will translate unlikely
> basic blocks well before they execute -- if they execute at all.
> Even the phrase "virtual machine" is problematic, since the CPU
> itself is a kind of virtual machine. It's turtles all the way down.
>
> However, none of this changes the fact that we use certain names as
> terms of art all the time. So, we decided to shun the siren song of
> perfect, technically accurate naming, and settle on names that
> reflected our basic everyday thinking and speech.
>
> Therefore:
>
> * "Bytecode"
>
> Anything in JavaScriptCore currently called "byte code", "opcode", "op
> code", "code", "bitcode", etc. we'll rename to "bytecode". We use one
> word with no camel case to indicate that the word is just jargon,
> and not a true compound word. It's not really one byte, but we're
> over that. We use this word every day when talking about our code,
> so we'll use it in our code, too.
This sounds good in general, although as I pointed out on IRC and will
repeat for the benefit of others here, I don't think "bytecode" and
"opcode" are really synonyms. My understanding of the terms would be:
- bytecode is a mass noun, like "machine code" - you can't have "a
bytecode"
- a singular unit of bytecode is an "instruction", or "bytecode
instruction" if you must
- the part of the instruction that says which operation to perform,
rather than what the operands are, is an "opcode"
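
To make the distinction concrete, here is a rough C++ sketch. All the names in it (Opcode, Instruction, Bytecode, countOpcode, sampleBytecode) are made up for illustration and are not the actual JavaScriptCore types; it only shows how the three terms would relate to each other.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical names throughout; not JavaScriptCore's real data structures.

// The "opcode": says which operation to perform, nothing about operands.
enum class Opcode : std::uint8_t { Load, Add, Ret };

// One "instruction": a single unit of bytecode, i.e. an opcode plus operands.
struct Instruction {
    Opcode opcode;
    std::int32_t operands[3];
};

// "Bytecode" as a mass noun: a sequence of instructions.
using Bytecode = std::vector<Instruction>;

// Count how many instructions in a stream carry the given opcode.
inline int countOpcode(const Bytecode& code, Opcode op)
{
    int count = 0;
    for (const Instruction& instruction : code) {
        if (instruction.opcode == op)
            ++count;
    }
    return count;
}

// A tiny made-up program: load two constants, add them, return the result.
inline Bytecode sampleBytecode()
{
    return {
        { Opcode::Load, { 0, 7, 0 } },
        { Opcode::Load, { 1, 35, 0 } },
        { Opcode::Add,  { 2, 0, 1 } },
        { Opcode::Ret,  { 2, 0, 0 } },
    };
}
```

With these definitions, sampleBytecode() is "some bytecode" made of four instructions, two of which share the Load opcode.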
>
> * "BytecodeGenerator"
>
> The class used to generate bytecode. "Generator" clearly indicates
> that the class outputs bytecode, whereas a name like "compiler"
> might mean that the class outputs bytecode, or it might mean that
> the class takes bytecode as its input. Also, we thought that names
> like "compiler" implied a larger suite of tools not included in this
> class.
I do believe that the term Bytecompiler specifically refers to a
compiler that outputs bytecode, and can never refer to a compiler that
takes bytecode as input instead. It's a little shorter. But
technically our bytecompiler encompasses not just the
BytecodeGenerator class but also all the emit functions in Nodes.cpp,
so I'd probably use it for a directory, not a class.
>
>
> * "BytecodeInterpreter"
>
> The class that executes a program in bytecode form. We liked the
> symmetry with "BytecodeGenerator". We rejected names like BytecodeVM
> because we thought the name "virtual machine" was a little too
> vague, and it implied a larger suite of functionality not limited to
> this class.
>
> * "JIT"
>
> The class that translates a program in bytecode form to CPU-specific
> code. We rejected "BytecodeJIT" because we couldn't tell if a
> BytecodeJIT had bytecode as its input or its output. It's not
> symmetric with BytecodeInterpreter, but oh well. We liked "JIT"
> because we thought that interpreter vs JIT was a widely used and
> understood dichotomy.
I like all of these.
> So we have this directory structure:
>
> bytecode
> -> generator
> -> interpreter
> -> jit
> -> sampler
I'm not sure I like having a lot of subdirectories under bytecode
though, particularly since they will each contain so few files. I'd
propose:
- bytecodegenerator or bytecompiler at top level (Bytecompiler is a
slightly more concise term of art for a compiler that outputs
bytecode, with no ambiguity about whether the bytecode is going in or
out)
- a bytecode directory at top level containing general bytecode data
structures and the bytecode interpreter
- a jit directory at top level
- sampler stuff relegated to one of the above
That's more in line with the directory structure we all discussed
before, and which we've barely had a chance to get used to.
> It bears mentioning that JavaScriptCore also contains a bytecode and
> a JIT for regular expressions, so the names above might be vague. We
> decided that the best solution was to treat regular expression
> functionality as secondary, giving classes related to it an extra
> prefix, or a different namespace. We also decided not to bother
> changing the nomenclature in PCRE, because PCRE is frozen in time.
> Also, PCRE hurts me in the brain.
>
> Many other small renames and file splittings are included in my
> patch, but they all tend to follow from these. The only substantial
> one I can think of is "__". Right now we have code like this:
>
> m_jit.movl_i32r(...)
>
> In this context, "m_jit" is a data member of "JIT", and an instance
> of the "X86Assembler" class. JIT::m_jit is weird, and calling an
> assembler a JIT is also weird. So, we opted to rename "m_jit" to
> "m_assembler", and then for brevity, use a macro to replace
> "m_assembler." with "__", so you get this:
>
> __ movl_i32r(...)
>
> There are a few cases where this looks a little weird right now, but
> they're fixable. In general, this approach has worked well for other
> projects, so it will probably work well for us.
I'm not a huge fan of this but I must admit m_jit would be wrong and
saying m_assembler all the time would strain readability.
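
For anyone who has not seen the "__" idiom before, a minimal sketch of how it could look follows. ToyAssembler and ToyJIT are hypothetical stand-ins, not the real X86Assembler or JIT classes, and the toy assembler only records text instead of emitting machine code. Note also that identifiers containing a double underscore are formally reserved for the implementation in C++, so the macro relies on compilers tolerating it in practice.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an assembler: records mnemonics as strings
// rather than emitting real machine code.
class ToyAssembler {
public:
    void movl_i32r(int immediate, int reg)
    {
        m_log.push_back("movl $" + std::to_string(immediate) + ", r" + std::to_string(reg));
    }
    void ret() { m_log.push_back("ret"); }
    const std::vector<std::string>& log() const { return m_log; }

private:
    std::vector<std::string> m_log;
};

// Hypothetical stand-in for the JIT class, owning an assembler member.
class ToyJIT {
public:
    void compile();
    const ToyAssembler& assembler() const { return m_assembler; }

private:
    ToyAssembler m_assembler;
};

// Inside ToyJIT member functions, "__ foo()" expands to "m_assembler. foo()".
#define __ m_assembler.

void ToyJIT::compile()
{
    __ movl_i32r(42, 0);
    __ ret();
}

// Keep the macro from leaking into unrelated code.
#undef __
```

The macro only makes sense inside member functions of a class that has an m_assembler member, which is why the sketch undefines it right after the code that uses it.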
Regards,
Maciej