
DocIdeas
Commit

Commit MetaInfo

Revision	865669d7dc68aa11fe701358e427a65b1a4e4091 (tree)
Time	2023-05-23 03:46:46
Author	Albert Mietus < albert AT mietus DOT nl >
Committer	Albert Mietus < albert AT mietus DOT nl >

Log Message

Grammerly

Change Summary

modified: CCastle/1.Usage/1.CompilerCompiler.rst (diff)

Incremental Difference

diff -r cae120b77758 -r 865669d7dc68 CCastle/1.Usage/1.CompilerCompiler.rst

--- a/CCastle/1.Usage/1.CompilerCompiler.rst Sat Apr 08 23:21:49 2023 +0200

+++ b/CCastle/1.Usage/1.CompilerCompiler.rst Mon May 22 20:46:46 2023 +0200

		@@ -11,7 +11,7 @@
11	11	:category: Castle, Usage
12	12	:tags: Castle, Grammar
13	13
14		- In Castle you can define a grammar directly in your code. The compiler will translate them into functions, using
	14	+ In Castle, you can define grammar(s) directly in your code. The compiler will translate them into functions, using
15	15	the built-in (PEG) compiler-compiler -- at least, that is what it was called back in the days of YACC.
16	16
17	17	How does one use that? And why should you?

		@@ -21,7 +21,7 @@
21	21	=======================
22	22
23	23	A grammar is a collection of (parsing)-rules and optionally some settings. Rules are written in a mixture of EBNF
24		-and PEG meta-syntax. Let’s start with an simple example:
	24	+and PEG meta-syntax. Let’s start with a simple example:
25	25
26	26	.. code-block:: PEG
27	27

		@@ -33,132 +33,131 @@
33	33
34	34
35	35	This basically defines that a ``castle_file`` is a sequence of ``import_line``\(s), ``interface``\(s), or
36		-``implementation``\(s); which are all “non-terminal(s)” -- see below. Each of those non-terminals are defined by more
37		-rules. By example, an ``import_line`` starts with the ``IMPORT_stmt`` then comes either a ``STRING_literal`` or a
	36	+``implementation``\(s); which are all “non-terminal(s)” -- see below. All of those ’non-terminals’ are defined by more
	37	+rules. For example, an ``import_line`` starts with the ``IMPORT_stmt`` then comes either a ``STRING_literal`` or a
38	38	``qualID``, and it ends with a semicolon (`';'`). Likewise, the ``IMPORT_stmt`` is set to ‘import’ (literally).
39	39
40		-As we see, a grammar contains non-terminals and terminals. Non-terminals are abstract and defined by grammar-rules,
41		-containing both (other) non-terminals and terminals. Terminals are concrete: it are the things (tokens) you type when
	40	+As we see, the grammar contains non-terminals and terminals. Non-terminals are abstract and defined by grammar rules,
	41	+containing both (other) non-terminals and terminals. Terminals are concrete: they are the things (tokens) you type when
42	42	programming. Some terminals are like constants, such as the semicolon at the end of ``import_line``; therefore they are
43		-quoted in the grammar (Notice, the is also a non quoted semicolon on each line, that is part of the syntax of grammar.)
	43	+quoted in the grammar. (Notice, there is also a non-quoted semicolon on each line, which is part of the grammar syntax.)
44	44	\|BR\|
45		-Other terminals are more like valuables, they have a value. The ``STRING_literal`` is a good example. It’s value is the
46		-string itself. Similar for numbers and variable-names.
	45	+Other terminals are more like variables: they have a value. The ``STRING_literal`` is a good example. Its value is the
	46	+string itself. The same holds for numbers and variable names.
47	47
48		-In this example grammar, a ``qualID`` is a ``nameID`` (a name that is used as ID, like in any programming language),
	48	+In this (example) grammar, a ``qualID`` is a ``nameID`` (a name that is used as ID, like in any programming language),
49	49	optionally followed by sub-names *(again like most languages: a dotted name, specifying a field (in a field, in
50	50	...)*. In Castle, that name may start with a dot --which is a shorthand notation for “in the current namespace”. You can
51		-ignore that for know.
	51	+ignore that for now.
52	52
53		-Basically, grammers defines how one should read the input --a text--, or more formally: how to parse it. The result of
54		-this parsing is twofold. It will check whether input conforms to the grammer; resulting in boolean, for the
55		-mathematics under us. And it will translate a sequential (flat) text into a tree-structure; which typically much more
56		-useful for a software-engineer.
	53	+A grammar defines how one (aka the compiler) should read the input --a text--, or more formally: how to parse it. The
	54	+result of this parsing is twofold. It will check whether the input conforms to the grammar; resulting in a boolean, for
	55	+the mathematicians among us. And it will translate a sequential (flat) text into a tree structure, which is typically much
	56	+more useful for a software engineer.
57	57	\|BR\|
58		-A well known example is this HTML-file. On disk it’s nothing but text, which is easy to store and to transfer. But when
59		-send to your brouwer, it’s parsed to create the `DOM <https://nl.wikipedia.org/wiki/Document_Object_Model>`__; a
60		-tree of the document, with sections, paragraphs, hyper-links, etc. By regarding it as a tree, it easy to describe
61		-(e.g. with CSS) how arts should be shown: all headers have a background, the first row in a table is highlighed,
62		-etc.
63	58
	59	+A well-known example is this HTML file. On disk, it’s nothing but text, which is easy to store and transfer. But when
	60	+sent to your browser, it’s parsed to create the `DOM <https://nl.wikipedia.org/wiki/Document_Object_Model>`__; a tree
	61	+of the document, with sections, paragraphs, hyperlinks, etc. By regarding it as a tree, it becomes easy to select parts and
	62	+to describe (e.g. with CSS) how those parts should be shown: all headers have a background, the first row in a table is
	63	+highlighted, etc.
64	64
65	65	Parsing
66	66	=======
67	67	Another well-known example is (the source of a) program. As code, it is just text. But the compiler will parse it into
68		-a parse-tree and/or an abstract-syntax-tree; which is build out of classes, methods, statements etc.
	68	+a parse tree and/or an “abstract syntax tree” (AST); which is built out of classes, methods, statements, etc.
69	69	\|BR\|
70	70	But also your favorite IDE will parse it; to highlight the code, give tooltips, enable you to quickly navigate and
71		-refactor it and all those conviant features that make it your favorite editor.
72		-
	71	+refactor it, and all those convenient features that make it your favorite editor.
73	72	And even you are probably parsing text as part of your daily job. When you un-serialise data, you are (often) parsing
74		-text; when you read the configuration, you are (or should be ) parsing that text. Even a simple input of the user might
	73	+text; when you read the configuration, you are (or should be) parsing that text. Even a simple input from the user might
75	74	need a bit of parsing. The text “42” is not the number :math:`42.0` -- you need to convert it; parse it.
76	75
77		-There a many ways to parse. You do not need a full-fledged grammer to translate “42” into :math:`42` or
78		-:math:`42.0`; a stdlib-function as ``atoi()`` or ``atof()`` will do. But how about handling complex numbers
79		-(:math:`4+j2`) or fractions (:math:`\frac{17}{42}`)?
	76	+There are many ways to parse. A full-fledged grammar is not needed to translate (the text) ‘42’ into the int :math:`42` or
	77	+the float :math:`42.0`; a stdlib function such as ``atoi()`` or ``atof()`` will do. But how about handling
	78	+complex numbers (:math:`4+j2`) or fractions (:math:`\frac{17}{42}`)?
80	79
81	80	Non-parsing
82	81	-----------
83	82
84	83	As writing a proper parser used to be (too) hard, other similar (but simpler) techniques are often used, like `globbing
85	84	<https://en.wikipedia.org/wiki/Glob_(programming)>`__ (``*.Castle`` on the bash-prompt will result in all
86		-Castle-files)*. This is simple, and will is very simple cases.
	85	+Castle-files). This is simple and will do in very simple cases.
87	86	\|BR\|
88		-Other try to use `regular-expressions <https://en.wikipedia.org/wiki/Regular_expression>`__ for parsing. RegExps are
89		-indeed more powerfull then globing, and often used to highlight code. A pattern as ``//.*$`` can be used to highlight
90		-(single-line) comment. It often works, but this simple pattern might match a piece of text inside a
91		-multi-line-(doc)string -- which wrong.
	87	+Others try to use `regular expressions <https://en.wikipedia.org/wiki/Regular_expression>`__ for parsing. RegExps are
	88	+indeed more powerful than globbing and are often used to highlight code. A pattern such as ``//.*$`` can be used to highlight
	89	+(single-line) comments. It often works, but this simple pattern might match a piece of text inside a
	90	+multi-line-(doc)string -- which is wrong.
92	91
93		-To parse an input-text its not a sound solution; although I have seen cunning regular-expressions, that almost always
94		-work. But reg-exps have not the same power as a grammar-- That is already proven halve a century ago and will not be
95		-repeated here.
	92	+Those tricks aren’t a sound solution for parsing generic input text, although I have seen cunning RegExps that (almost)
	93	+always work. Regular expressions do not have the same power as grammars; that was proven half a century
	94	+ago and will not be repeated here.
96	95
97		-Grammars are more powerfull
98		-===========================
	96	+Grammars are more powerful
	97	+==========================
99	98
100		-A grammar (even a simple one) is more powerfull. You can define the overal structure of the input and the sub-structure
101		-of each lump. When a multi-line-string has no sub-structure, the parser will never find comments inside it. Nor other
102		-way around; it simple is not hunting for it.
	99	+A grammar (even a simple one) is more powerful. You can define the overall structure of the input and the sub-structure
	100	+of each lump. When a multi-line-string has no sub-structure, the parser will never find comments inside it. Nor the other
	101	+way around; it simply is not hunting for it.
103	102
104		-As most programming-languages do not have build-in support for grammars, one has to resort to external tools. Like the
105		-famous `YACC <https://en.wikipedia.org/wiki/Yacc>`__; developed in 197X. YACC will read a grammar-file, and generates
	103	+As most programming languages do not have built-in support for grammars, one has to resort to external tools. Like the
	104	+famous `YACC <https://en.wikipedia.org/wiki/Yacc>`__; developed in 197X. YACC will read a grammar file and generate
106	105	C-code that can be compiled and linked to your code.
107	106
108		-Back then, writing compiler-compilers was a popular academic research exercise (YACC stand for: Yet Another Compiler
109		-Compiler). It was great for compiler-designers, but clumsy to use for average developers: The syntax to write a grammar
	107	+Back then, writing compiler-compilers was a popular academic research exercise (YACC stands for: Yet Another Compiler
	108	+Compiler). It was great for compiler designers, but clumsy to use for average developers: The syntax to write a grammar
110	109	was hard to grasp, with many pitfalls, the interface between your code and the parser was awkward (you had to call
111	110	``yyparse()``; needed some globals; OO wasn't invented, no inheritance or data-hiding, which resulted in puzzling tricks
112	111	to use multiple parsers, etc).
113	112	\|BR\|
114		-Aside of that, more and better parsing strategies are developed; that is handles in another :ref:`blog <grammmar-code>`.
	113	+Aside from that, more and better parsing strategies have been developed; that is covered in another :ref:`blog <grammmar-code>`.
115	114
116	115	Unleash that power!
117	116	-------------------
118	117
119		-With those better parsing-algorithms, faster computers with a lot more memory and other inventions, writing grammars
120		-has become more peaceful. Except that you still need an extra step, another sytax, as you still need to use an external
	118	+With those better parsing algorithms, faster computers with a lot more memory, and other inventions, writing grammars
	119	+has become more peaceful. Except that you still need an extra step, another syntax, as you still need to use an external
121	120	tool. That sometimes isn’t maintained after a couple of years ...
122	121	\|BR\|
123		-The effect is, most developers don’t use grammars; they write parser-like code manually, or the settle for less optimal
124		-result. Or are utterly not aware that grammer can provide another, better, easier solution.
	122	+The effect is, most developers don’t use grammars; they write parser-like code manually, or they settle for less optimal
	123	+results. Or they are utterly unaware that a grammar can provide another, better, easier solution.
125	124
126	125	With a few lines, you can define the structure of the input. Each rule is like a function: it has a name (the
127		-left-hand-side of the rule, so the part before the arrow), and an implementation; the part after the arrow. That
128		-implementation “calls” other rules, like normal code.
	126	+‘left-hand side’ (LHS) of the rule, so the part before the arrow), and an implementation; the part after the
	127	+arrow. That implementation “calls” other rules, like normal code.
129	128	\|BR\|
130		-When you call the “main rule function”, with the input-stream as input, that file is parsed, and the complete input is
131		-ready to use; not more manual scanning and parsing. And when the file-structure is slightly updated, you just add a few
132		-details to the grammer.
	129	+When you call the “main rule function”, with the input stream as input, that file is parsed, and the complete input is
	130	+ready to use; no more manual scanning and parsing. And when the file structure is slightly updated, you just add a few
	131	+details to the grammar.
133	132
134		-Castle has it build-in
	133	+Castle has it built-in
135	134	======================
136	135
137	136	Grammars make reading text easy. Define the structure, call the “main rule” and use the values. Castle makes that
138	137	simple!
139	138
140		-.. use:: Castle has build-in grammers support
141		- :ID: U_Grammers
142		-
143		- In Castle one can define a grammer directly into the source-code; as :ref:`grammmar-code`!
	139	+.. use:: Castle has built-in grammar support
	140	+ :ID: U_Grammars
144	141
145		- And, like many other details, the language is hiding the nasty details of parsing-strategies. There is no need to
146		- generating, compiling, and use that code, with external tools. All that clutter is gone.
	142	+ In Castle one can define a grammar directly into the source code; as :ref:`grammmar-code`!
147	143
148		- .. tip:: The standard parsing-algorithm is PEG; but that is not an requirement.
	144	+ And, like many other details, the language is hiding the nasty details of parsing strategies. There is no need to
	145	+ generate, compile, and use that code, with external tools. All that clutter is gone.
149	146
150		- The syntax of grammers is quite generic, it’s the implementation of the Castle-compiler that implements the
151		- parsing-strategy; it should supports PEG. But it is free to support others as well (with user-selectable
	147	+ .. tip:: The standard parsing algorithm is PEG, but that is not a requirement.
	148	+
	149	+ The syntax of grammars is quite generic, it’s the implementation of the Castle compiler that implements the
	150	+ parsing strategy; it should support PEG. But it is free to support others as well (with user-selectable
152	151	compiler-plugins).
153	152	\|BR\|
154	153	This is not unlike other compiler-options.
155	154
156		-To use the grammar you simply call one of those rules as a function: pass the input (string) and it will return a
157		-(generic) tree-structure.
158		-When you simple like to verify the syntax is correct: use the tree as a boolean: when it not-empty the input is valid.
	155	+To use the grammar, you simply call one of those rules as a function: pass the input (string) and it will return a
	156	+(generic) tree structure.
	157	+When you only want to verify that the syntax is correct, use the tree as a boolean: when it is not empty, the input is valid.
159	158	\|BR\|
160		-Typically however, you traverse that tree, like you do in many situations.
	159	+Typically, however, you traverse that tree, like you do in many situations.
161	160
162	161	To read that early configuration: parse the file and walk over the tree. Or use it “ala a DOM” by using Castle’s
163		-:ref:`matching-statements` to simply. Curious on how that works: continue reading in :ref:`grammmar-code`.
	162	+:ref:`matching-statements` to simply. Curious about how that works: continue reading in :ref:`grammmar-code`.
164	163	Or skip to “Why there are :ref:`G2C-actions`”.
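
The regular-expression pitfall described in the "Non-parsing" section of the diff above (the pattern ``//.*$`` matching text inside a multi-line string) can be reproduced with a small Python sketch. This is only an illustration under stated assumptions: the sample input, the double-quoted multi-line string syntax, and the function names ``naive_comments`` and ``scan_comments`` are hypothetical and are not part of Castle or of this commit. The hand-written scanner stands in for what a grammar-based parser would do; it is not the PEG machinery the text refers to.

```python
import re

# The naive highlighter pattern from the text: '//' up to the end of the line.
COMMENT_RE = re.compile(r"//.*$", re.MULTILINE)


def naive_comments(source: str) -> list[str]:
    """Return everything the regular expression takes for a comment."""
    return COMMENT_RE.findall(source)


def scan_comments(source: str) -> list[str]:
    """Collect '//' comments, but skip over double-quoted strings first.

    Assumption (hypothetical syntax): strings are double-quoted, may span
    several lines, and use a backslash to escape the next character.
    """
    comments = []
    i, n = 0, len(source)
    while i < n:
        if source[i] == '"':
            i += 1  # step over the opening quote
            while i < n and source[i] != '"':
                # An escaped character takes two positions.
                i += 2 if source[i] == "\\" else 1
            i += 1  # step over the closing quote
        elif source.startswith("//", i):
            end = source.find("\n", i)
            if end == -1:
                end = n
            comments.append(source[i:end])
            i = end
        else:
            i += 1
    return comments


# Hypothetical input: a multi-line string that contains '//' on its own line.
SAMPLE = 'text = "first line\n// not a comment\nlast line";\nx = 1; // real comment\n'

print(naive_comments(SAMPLE))  # ['// not a comment', '// real comment']
print(scan_comments(SAMPLE))   # ['// real comment']
```

The regular expression reports two comments, one of which lies inside the string. The scanner knows the structure of a string and therefore never looks for comments inside it, which is the point the text makes about grammars.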
