In digital forensics, an artifact is any trace left on a system by an adversary. Examples include files, registry keys and event logs.
> It's also anything produced by an artistic production process, e.g. in software: build artifacts or documentation.
You can drop "artistic" from that, as it carries a connotation that doesn't necessarily apply. The first part of the term comes from the more general meaning of ars/art, which translates into English as something close to "craft".
If you follow the (Latin) origin it is even wider: in Italian, artifact is artefatto, where the art (arte) part is as you say, while the fatto comes from fare (Latin facere), which translates to "made".
And we say "fatto ad arte" to mean that something is "intentionally made", i.e. artefatto is something that doesn't happen naturally and/or does not exist in nature.
I've used the Google Vision API on a wide variety of Arabic fonts and it has worked pretty well at recognising ligatures, but not diacritics: it either doesn't recognise them or adds non-existent ones.
Whilst learning classical Arabic, looking up words while reading was a pain: you search the Hans Wehr dictionary by first extracting the root letters of the word, then looking up the root, and then finding the correct form.
So I put together a frontend [1] to search a scanned version of it that I found. It has since evolved into an offline PWA, with which you can search from Arabic to English and English to Arabic, convert numbers between classical and modern Arabic forms, stem words, etc. When searching in Arabic it removes affixes (prefixes and suffixes) and also applies multiple stemming algorithms to obtain the root if an exact match is not found.
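The lookup strategy described above (exact match first, then affix stripping) could look roughly like this. This is a minimal sketch: the affix lists, the toy dictionary and the `lookup`/`candidates` helpers are all invented for illustration, not the app's actual code.

```python
# Hypothetical sketch: try an exact match first, then strip common
# Arabic affixes (prefixes/suffixes) and retry against the dictionary.

PREFIXES = ["ال", "و", "ب", "ف", "ك", "ل"]   # article, conjunctions, prepositions
SUFFIXES = ["ها", "هم", "كم", "نا", "ات", "ون", "ين", "ه", "ك", "ي", "ة"]

DICTIONARY = {"كتب": "to write", "علم": "to know"}  # toy root -> gloss table

def candidates(word):
    """Yield the word itself, then progressively affix-stripped forms."""
    yield word
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            yield word[len(p):]
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            yield word[:-len(s)]

def lookup(word):
    """Return (matched form, gloss) for the first candidate found."""
    for form in candidates(word):
        if form in DICTIONARY:
            return form, DICTIONARY[form]
    return None

print(lookup("الكتب"))  # definite article stripped, root matched
```

A real implementation would of course chain proper stemming algorithms after the naive stripping, as the comment describes.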
It's pretty neat and makes reading so much easier no matter what device I'm on. It's also used by professors, teachers, students and friends in the classical Arabic field.
Well you've found an عيب in the dictionary! Currently it uses a single dictionary, Hans Wehr, which doesn't contain that word (I've checked in my physical copy also).
On my backlog is adding additional dictionaries to fill in where Hans Wehr is lacking, with Lane's Lexicon next in line. Hans Wehr is a good dictionary for students and contains approximately 90% of the words they will encounter in their first few years of study. The issue at the moment is that the two dictionaries are structured differently, and it requires some manual work to unify their structures.
When searching for a verb you can enter either the root letters or any other form, even with affixes attached, and the stemming algorithms should recover the right root word.
Great. Good luck with adding new dictionaries and thank you for your efforts.
By the way, have you seen the work of Taha Zerrouki? He has produced many open source tools for working with the Arabic language [1]. Thought you might be interested.
My website was originally a Flask web app which used his Arabic number converter, Arabic stemmer, Arabic normaliser and other modules. But to make it usable offline as a PWA I ended up porting these to JavaScript. I will release the source code once porting is complete and it's properly tested.
Also, just going through your blog and enjoying the summary of Abul Hasan Ali Nadwi's book. Keep up the good work!
Edit: Also, Taha's work is amazing, and on the back of it I've been creating a programming language specifically for string processing, which compiles to multiple other languages, so it can be used for stemming algorithms etc.
The idea is to port Taha's work over and hopefully benefit communities in other programming languages.
That will definitely be useful for students, especially some of my classmates.
Is this open source? If so, I may contribute where possible.
Also, a feature request (if you're taking them): a vocabulary learner and a way to test yourself and track your progress.
My Toyota Corolla with Toyota Safety Sense 2.0 (fitted in nearly every Toyota post-2018/2019) has stop and start: either press Resume if you've been stopped for more than 5 seconds, or tap the accelerator. That's with Lane Tracing Assist as well. I don't think there's a second I don't have both Adaptive Cruise Control and Lane Tracing Assist activated.
Edit: You can also accelerate without disabling it.
I feel like I missed a huge generational gap driving a 2017 Toyota. None of this lane assist, no backup camera; the only new bell and whistle I get that's different from my car of 20 years ago is that this one beeps at me like I have a missile locked on my tail when someone is too close in front of me.
Best part is it's standard on nearly every model. I don't think there's a minute when I don't have it enabled, and it's full-range, so it works at any speed and on any road. The millimetre-wave radar is great when road markings aren't visible or when cars are manoeuvring around parked cars or other obstacles.
It's a great book. Bob's not a CS teacher, and it shows in a good way: the book is grounded in practice and structured so that you get a tangible result at the end of every chapter.
I second this. It's a great resource, very well written.
Opinionated enough to have character and humour, and never so dry as to make your eyes glaze over. I guess I'm saying that for a technical tome it's very, very readable (as was his Game Programming Patterns book, incidentally).
A virtual machine executes instructions, while an interpreter takes source code as input. An interpreter could be built on top of a virtual machine, obviously, but that's not necessary. For example, SICP/PLAI/OOPLAI [1] explain how to implement an interpreter on top of Scheme where you directly interpret the s-expressions. These may be worth a read if you want to learn about the usual concepts and techniques used in programming language implementations from a more general point of view. For example, how to implement a static type system or an object model based on classes.
Interpreters based on a virtual machine are actually on-the-fly compilers; the source code is not interpreted directly but translated into instructions beforehand.
Interpreters usually execute by walking the abstract syntax tree at runtime; bytecode VMs execute actual bytecode similar to machine code. The bytecode for such VMs is often much simpler than the instruction set of an actual computer, so it's far more portable between platforms.
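The contrast can be sketched with a toy example. Both the tuple-based AST shape and the tiny instruction set below are invented for illustration, not any real implementation; they compute the same expression, (1 + 2) * 3, in the two styles described above.

```python
# --- AST-walking interpreter: recurse over the tree at runtime ---
ast = ("mul", ("add", ("num", 1), ("num", 2)), ("num", 3))

def eval_ast(node):
    op = node[0]
    if op == "num":
        return node[1]
    left, right = eval_ast(node[1]), eval_ast(node[2])
    return left + right if op == "add" else left * right

# --- Bytecode VM: a flat instruction list and an operand stack ---
code = [("PUSH", 1), ("PUSH", 2), ("ADD",), ("PUSH", 3), ("MUL",)]

def run_vm(code):
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        elif instr[0] == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif instr[0] == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

print(eval_ast(ast), run_vm(code))  # both evaluate to 9
```

Note how the VM's dispatch loop never recurses: the tree structure has been flattened into a linear stream, which is what makes the bytecode form simpler and more portable.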
Keep in mind that bytecode-interpreted languages (Ruby, Python) are typically called interpreted languages. Java is usually called "compiled" because of the explicit compile step, but it's essentially the same as Ruby and Python. And in discussions about the JVM, "interpretation" in Java typically refers to bytecode interpretation vs. JIT compilation.
Ultimately, limiting interpreters to AST interpreters is not quite correct. The AST is just another IR that source code has to be converted into, just like bytecode.
And the AST is essentially also executed by a virtual machine. Interpretation of the IR (the AST, or bytecode, etc.) is one part of a VM. Of course in some instances the VMness of a runtime is more pronounced (Java) than in others (Ruby).
The difference between interpretation and compilation is that compiled output is meant to run on real hardware, whereas an interpreter executes the IR of a program by dynamically choosing which machine instructions to run.
Of course a compiler is also something that takes in some code and outputs some other, typically more low level representation.
My point being I don't think there is a strict or consistent definition these days for what an interpreter is.
Case in point: I've also heard people say interpreters read the code "line by line" (or, more accurately, as far as they need to read before they know what to execute next) and execute it piece by piece. Which might be how some interpreters in some languages worked in the past. An AST interpreter, OTOH, already implies pre-processing the whole source file. Is an AST interpreter then less of an interpreter than one that streams the source code line by line? Is such a program more of an interpreter than one which goes a step further and, e.g., executes a simplified AST, or one that executes a straightforward bytecode representation of that simplified AST?
> Interpreters usually execute by walking the abstract syntax tree at runtime, bytecode VMs usually execute actual bytecode similar to machine code.
This isn't the right way to separate these concepts. A VM that executes bytecodes by interpretation is also an interpreter (Python, Ruby, PHP are well-known examples). Many VMs don't interpret, or they interpret mostly during startup, but actually try to compile hot parts of the code and execute that machine code rather than interpreting (Java and JavaScript VMs are well-known examples).
The VM part more properly refers to things like data representation and memory management. Interpreters of an abstract syntax tree, interpreters of bytecodes, and just in time compilers all need to run inside a system that takes care of loading code, allocating memory when requested, and often doing garbage collection. These services are what make a VM. The exact execution strategy is not what determines VM-ness.
Which languages or implementations of languages directly interpret the AST without the intermediary bytecode compilation?
I know Python, Java and JavaScript (V8 and SpiderMonkey) all compile to bytecode first probably to speed up subsequent runs and run some optimisations on the bytecode.
What other benefits are there to compiling to bytecode first vs directly interpreting the AST?
One major benefit of compiling to bytecode first is that bytecode is a more convenient shared source of truth.
For example, SpiderMonkey has two interpreter and two compiler tiers. The output of the parser is bytecode. We can interpret it immediately, and it's a convenient input to the compilers. It also simplifies transitions between tiers: for example, when we bail out of Warp, we resume at a specific bytecode offset in the baseline interpreter.
I'm not sure how you would resume at a particular point in the execution of an AST-based interpreter without agreeing on a linearized traversal of the AST, which is 90% of the way to bytecode.
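A minimal sketch of why the flat instruction stream helps here: the whole execution state is just a program counter plus an operand stack, so "resume at offset N" is trivial. The instruction set and the `run` function below are invented for illustration; they have nothing to do with SpiderMonkey's actual bytecode.

```python
def run(code, pc=0, stack=None):
    """Execute a toy bytecode stream, starting from an arbitrary offset."""
    stack = stack if stack is not None else []
    while pc < len(code):
        op, *args = code[pc]
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        pc += 1
    return stack[-1]

code = [("PUSH", 1), ("PUSH", 2), ("ADD",), ("PUSH", 10), ("ADD",)]

full = run(code)                      # run from the start
# Simulate a tier bailing out after the first ADD: hand the next tier
# the bytecode offset (3) and the current operand stack ([3]).
resumed = run(code, pc=3, stack=[3])
print(full, resumed)                  # both 13
```

With an AST walker, the equivalent hand-off would have to describe a position inside a tree traversal, which, as noted above, is most of the way to inventing a linearised encoding anyway.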
If you compile to an AST and walk that, your portability is at the source level: you have to send the source over the wire, and each target needs a parser and a walker and has to parse and walk it. If you compile to bytecode, you can send the bytecode over the wire and simply interpret it.
Portability.
Say I wanted to make language X run on all platforms, but I didn't actually care about compiling it on all platforms. I can just write a relatively simple VM for each platform. This is one of the reasons Java was, and still kinda is, so ubiquitous.
Why would it be less work? The interpreter will need to implement whatever operations a VM can perform, so a priori it's at least as much work. Bonus: if you can bootstrap the source-to-bytecode process, then you only need to write (and compile) it once to get a full-fledged interpreter on every host with a VM.
As others mentioned, code can be distributed that way, and I think creating a simple VM is easier than creating a simple language parser. But of course, an optimising one can be really quite complex in either case.
Gambit Scheme, CLISP, CMUCL are capable of interpreting ASTs, and I believe (although I'm not 100% sure) that this is the case for Lispworks and Allegro CL as well. Also, Ruby until version 1.8.
These sentiments generally come without an understanding of the historical context, and from taking sloppily translated texts that skew the meaning. Just search HN for Islam, Christianity, Quran, Torah or Bible and you will find people copy-pasting these mistranslated and misunderstood texts in comments, feeling like they're the biggest scholars of the century.
Sure, actual academic discussions are fine, with substance and evidence to support them, where the author has put in some effort and has an actual understanding of the subject matter. But instead there is an attitude of "oh, I'll Google for something and copy-paste it".
Understanding the Quran literally is not the only way Islamic law is derived.
According to the four Schools of Jurisprudence among the Sunnis and the Ja'fari School of Jurisprudence among the Shias, Islamic law is derived from the following sources:
- Quran and Quranic exegesis
- Ahaadith (prophetic traditions)
- Ijmaa' (consensus)
- Qiyaas (analogy - limited use cases and remit)
This is a ruling that is held by each of the Schools of Jurisprudence in the Sunnis and the Ja'fari School of Jurisprudence of the Shias.
As a side note, the modern-day Salafis/Wahabis/Ahlul Hadith interpret the Quran and Ahaadith (Prophetic Traditions) literally, which is how they reach rulings that differ from those of the accepted Schools of Jurisprudence, which have had millions of scholars working together over centuries.
https://github.com/hyperium/hyper/issues/1818