What is the difference between syntax and semantics in programming languages?


Solution 1

TL;DR

In summary, syntax concerns only whether or not a sentence is valid according to the grammar of the language. Semantics is about whether or not the sentence has a valid meaning.

Long answer:

Syntax is about the structure or the grammar of the language. It answers the question: how do I construct a valid sentence? All languages, even English and other human (aka "natural") languages, have grammars, that is, rules that define whether or not a sentence is properly constructed.

Here are some C language syntax rules:

  • terminate statements with a semicolon
  • enclose the conditional expression of an if statement inside parentheses
  • group multiple statements into a single statement by enclosing them in curly braces
  • data types and variables must be declared before the first executable statement of a block (this restriction was dropped in C99; C99 and later allow declarations and statements to be mixed)
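
As a rough illustration of these rules, here is a small sketch written in the older, pre-C99 style. The function name abs_diff and its behavior are made up for this example; only the layout matters.

```c
/* Hypothetical helper, used only to show the syntax rules in one place. */
int abs_diff(int a, int b)
{
    int tmp;          /* declaration comes before the first executable statement */

    if (a < b) {      /* the condition of the if is enclosed in parentheses */
        tmp = a;      /* each statement ends with a semicolon */
        a = b;        /* the braces group these three statements into one */
        b = tmp;
    }
    return a - b;
}
```

Calling abs_diff(3, 10) or abs_diff(10, 3) should give the same result, since the braces make the swap a single compound statement controlled by the if.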

Semantics is about the meaning of the sentence. It answers the questions: is this sentence valid? If so, what does the sentence mean? For example:

x++;                  // increment
foo(xyz, --b, &qrs);  // call foo

are syntactically valid C statements. But what do they mean? Is it even valid to attempt to transform these statements into an executable sequence of instructions? These questions are at the heart of semantics.

Consider the ++ operator in the first statement. First of all, is it even valid to attempt this?

  • If x is a struct or an array, this statement has no meaning (according to the C language rules) and thus it is an error even though the statement is syntactically correct.
  • If x is a pointer to some data type, the meaning of the statement is to "add sizeof(some data type) to the value at address x and store the result into the location at address x".
  • If x has an arithmetic type (an integer type or a floating-point type such as float), the meaning of the statement is "add one to the value at address x and store the result into the location at address x".
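
To make the type-dependence concrete, here is a minimal sketch. The helper names are hypothetical and exist only for this illustration; the same x++ / p++ syntax is given a different meaning depending on the operand's type.

```c
#include <stddef.h>

/* Integer operand: ++ adds one to the stored value. */
int increment_int(int x)
{
    x++;
    return x;
}

/* Pointer operand: ++ moves to the next element, so the address
 * grows by sizeof(int) bytes rather than by one byte. */
size_t bytes_advanced_by_increment(void)
{
    int items[2] = {10, 20};
    int *p = items;
    char *before = (char *)p;

    p++;
    return (size_t)((char *)p - before);
}

/*
 * By contrast, given "struct point { int x, y; } pt;", the statement "pt++;"
 * is grammatically well formed, yet a conforming compiler rejects it because
 * ++ requires an arithmetic or pointer operand.
 */
```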

Finally, note that some semantics cannot be determined at compile-time and therefore must be evaluated at run-time. In the ++ operator example, if x is already at the maximum value for its data type, what happens when you try to add 1 to it? Another example: what happens if your program attempts to dereference a pointer whose value is NULL?
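
A hedged sketch of how these run-time questions play out in C: unsigned arithmetic wraps around by definition, while signed overflow and dereferencing NULL are undefined behavior, so a careful program checks first. The function names below are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Unsigned overflow is well defined: the result wraps modulo UINT_MAX + 1. */
unsigned int next_unsigned(unsigned int x)
{
    return x + 1u;
}

/* Signed overflow is undefined, so test before incrementing.
 * Returns 1 if x + 1 is representable as an int, 0 otherwise. */
int can_increment(int x)
{
    return x < INT_MAX;
}

/* Dereferencing NULL is undefined, so guard the access.
 * Returns *p, or fallback when p is NULL. */
int read_or_default(const int *p, int fallback)
{
    return p != NULL ? *p : fallback;
}
```

None of these checks can be done by the grammar alone; they depend on the values the program holds while it runs.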

Solution 2

Syntax refers to the structure of a language, tracing its etymology to how things are put together.
For example, you might require the code to be put together by declaring a type, then a name, and then a semicolon, to be syntactically correct.

Type token;

On the other hand, the semantics is about meaning. A compiler or interpreter could complain about syntax errors. Your co-workers will complain about semantics.

Solution 3

Semantics is what your code means--what you might describe in pseudo-code. Syntax is the actual structure--everything from variable names to semi-colons.

Solution 4

Wikipedia has the answer. Read syntax (programming languages) & semantics (computer science) wikipages.

Or think about the work of any compiler or interpreter. The first step is lexical analysis, where tokens are generated by dividing the input string into lexemes. Next comes parsing, which builds an abstract syntax tree (AST), a representation of the syntax. The following steps involve transforming or evaluating that AST (semantics).
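
The pipeline can be sketched with a toy evaluator for expressions such as "1 + 2 * 3". Everything here is an assumption made for illustration: the grammar (expr := term ('+' term)*, term := NUMBER ('*' NUMBER)*), the names, and the fixed-size node pool. It is not how any real compiler is organized, and it assumes small non-negative integer literals.

```c
#include <ctype.h>
#include <stddef.h>

enum node_kind { NODE_NUM, NODE_ADD, NODE_MUL };

struct node {
    enum node_kind kind;
    int value;                   /* used by NODE_NUM */
    struct node *left, *right;   /* used by NODE_ADD and NODE_MUL */
};

struct parser {
    const char *src;             /* remaining input */
    struct node pool[64];        /* fixed-size storage for AST nodes */
    size_t used;
    int failed;                  /* set to 1 on a syntax error */
};

static struct node *new_node(struct parser *p, enum node_kind kind, int value,
                             struct node *left, struct node *right)
{
    struct node *n;
    if (p->used == 64) { p->failed = 1; return NULL; }
    n = &p->pool[p->used++];
    n->kind = kind; n->value = value; n->left = left; n->right = right;
    return n;
}

/* Lexical analysis: skip blanks and look at the next significant character. */
static int peek(struct parser *p)
{
    while (*p->src == ' ') p->src++;
    return (unsigned char)*p->src;
}

/* Lexical analysis: read one NUMBER lexeme and wrap it in a leaf node. */
static struct node *parse_number(struct parser *p)
{
    int value = 0;
    if (!isdigit(peek(p))) { p->failed = 1; return NULL; }
    while (isdigit((unsigned char)*p->src))
        value = value * 10 + (*p->src++ - '0');
    return new_node(p, NODE_NUM, value, NULL, NULL);
}

/* Parsing (syntax): term := NUMBER ('*' NUMBER)* */
static struct node *parse_term(struct parser *p)
{
    struct node *left = parse_number(p);
    while (!p->failed && peek(p) == '*') {
        struct node *right;
        p->src++;
        right = parse_number(p);
        left = new_node(p, NODE_MUL, 0, left, right);
    }
    return left;
}

/* Parsing (syntax): expr := term ('+' term)* */
static struct node *parse_expr(struct parser *p)
{
    struct node *left = parse_term(p);
    while (!p->failed && peek(p) == '+') {
        struct node *right;
        p->src++;
        right = parse_term(p);
        left = new_node(p, NODE_ADD, 0, left, right);
    }
    return left;
}

/* Semantics: walk the tree and compute what it means. */
static int evaluate(const struct node *n)
{
    switch (n->kind) {
    case NODE_NUM: return n->value;
    case NODE_ADD: return evaluate(n->left) + evaluate(n->right);
    default:       return evaluate(n->left) * evaluate(n->right);
    }
}

/* Returns the value of text, or on_error if text is not syntactically valid. */
int evaluate_text(const char *text, int on_error)
{
    struct parser p;
    struct node *root;

    p.src = text; p.used = 0; p.failed = 0;
    root = parse_expr(&p);
    if (p.failed || peek(&p) != '\0') return on_error;
    return evaluate(root);
}
```

Note that a syntax error such as "1 + * 3" is caught during parsing, before evaluate is ever called; the tree only gets a meaning once it is known to be well formed.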

Also, observe that if you defined a variant of C where every keyword was translated into its French equivalent (so if becomes si, do becomes faire, else becomes sinon, etc.), you would definitely change the syntax of your language, but you would not change the semantics much: programming in that French-C would not be any easier!
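
One way to play with this idea is the C preprocessor. The macros below are only a sketch of such a French-C (the function names signe, sign and compter_chiffres are invented for the example); the two sign functions differ in spelling but should behave identically.

```c
/* Hypothetical "French-C" keywords, mapped back to C by the preprocessor. */
#define si    if
#define sinon else
#define faire do

/* French-C spelling. */
int signe(int n)
{
    si (n > 0)
        return 1;
    sinon si (n < 0)
        return -1;
    sinon
        return 0;
}

/* Plain C spelling of the same function. */
int sign(int n)
{
    if (n > 0)
        return 1;
    else if (n < 0)
        return -1;
    else
        return 0;
}

/* Counts the decimal digits of a non-negative n using the renamed "do". */
int compter_chiffres(int n)
{
    int chiffres = 0;

    faire {
        chiffres++;
        n /= 10;
    } while (n > 0);
    return chiffres;
}
```

After preprocessing, signe and sign are token-for-token the same apart from their names, which is the point: the surface syntax changed, the meaning did not.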

Solution 5

  • You need correct syntax to compile.
  • You need correct semantics to make it work.
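
A small sketch of that distinction, with hypothetical function names: both functions below are syntactically valid and compile, but only one of them means what was intended (the sum 1 + 2 + ... + n).

```c
/* Compiles without complaint, but the loop stops one step early,
 * so the result is the sum 1 + 2 + ... + (n - 1). */
int sum_to_n_buggy(int n)
{
    int total = 0;
    int i;

    for (i = 1; i < n; i++)
        total += i;
    return total;
}

/* Same syntax apart from one operator; now the loop includes n. */
int sum_to_n(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

The compiler cannot tell which of the two you wanted; only a test or a reader who knows the intent can.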

Author: haccks

Updated on November 11, 2021

Comments

  • haccks
    haccks over 2 years

    What is the difference between syntax and semantics in programming languages (like C, C++)?

    • null
      null almost 4 years
      I would like to up vote but no research effort is evident.
  • haccks
    haccks almost 11 years
    OK. If x is at the maximum value for its data type and 1 is added to it, then it results in some weird output (0). Isn't that a semantic error?
  • Jeff N
    Jeff N almost 11 years
    Consider an odometer in a vehicle -- it has a series of interrelated wheels with the digits 0 through 9 printed on each one. The rightmost wheel rotates the fastest; when it wraps from 9 back to zero, the wheel to its immediate left advances by one. When this wheel advances from 9 to 0, the one to its left advances, and so on.
  • Jeff N
    Jeff N almost 11 years
    A datatype is like the wheel of an odometer: it can only hold up to a certain value. When the maximum value is reached, the next advance causes the wheel to return to zero. Whether or not this is a semantic error depends on the language rules. In this case, you need to refer back to the C language standard. I don't know exactly what the C language standard says, but here are some of the options. Overflow is: -not an error; the result is zero. -an error; the compiler MUST generate an overflow exception. -UNDEFINED;the compiler is free to do whatever it wants.
  • Daniel H
    Daniel H about 10 years
    In case anybody cares about the specific example, unsigned overflow is defined as modular arithmetic (so UINT_MAX + 1 == 0). Signed overflow is undefined. Modern compilers usually have INT_MAX + 1 == INT_MIN, but there are cases you can't count on this (e.g. for (i = 0; i <= N; ++i) { ... } where N is INT_MAX is not infinite depending on optimization; see blog.llvm.org/2011/05/what-every-c-programmer-should-know.html).
  • doctorlove
    doctorlove over 9 years
    @Talespin_Kit meaning rather than structure: logic is more an abstraction e.g. P => Q, etc or !!P = P, but when you add semantics things can have subtlety, if P is "happy", then !!P is "I'm not un-happy" != "I'm happy"
  • Jagdish
    Jagdish over 8 years
    +1 for "A compiler or interpreter could complain about syntax errors. Your co-workers will complain about semantics."
  • doubleOrt
    doubleOrt almost 6 years
    Is it a conversation between different people ? Or is it just one post ? I don't get it. E.g "No idea what the following is supposed to mean. It couldn't be more wrong".
  • ymln
    ymln over 5 years
    "note that some semantics cannot be determined at compile-time and must therefore must be evaluated at run-time" - I like how this has a parallel to natural languages. You can't know the meaning of some phrases without context. For example, in the phrase "He likes bananas" the meaning of "he" depends on context.
  • Vedant Panchal
    Vedant Panchal almost 4 years
    Compilers This link might be helpful to learn more
  • help-info.de
    help-info.de over 3 years
    Welcome to Stack Overflow. Before answering an old question having an accepted answer (look for green ✓) as well as other answers ensure your answer adds something new or is otherwise helpful in relation to them. Here is a guide on How to Answer.
  • hack3r-0m
    hack3r-0m about 3 years
    what about interpreted languages?
  • Vedant Panchal
    Vedant Panchal about 3 years
    A good question! But I don't think I can answer that. In my mind, basically, the same language can be either interpreted or compiled, based on the tool (realtime/interactive or compiler). Still, in the traditional sense, the answer helps to give an idea about any form of language.
  • haccks
    haccks over 2 years
    Nicely explained! Last para is the sum up.
  • Ta Thanh Dinh
    Ta Thanh Dinh about 2 years
    Both phrases are wrong. E.g. ``` int foo() { int x; return &x; } ``` is syntactically correct (but does not compile). A fix (i.e. making the semantics correct) by changing the type of foo to int* foo() makes the function buggy (i.e. it doesn't work), since it returns a dangling pointer.
  • meaning-matters
    meaning-matters about 2 years
    @TaThanhDinh The phrases are correct. There are of course more ways to mess up. I've kept my answer short and clear.
  • Ta Thanh Dinh
    Ta Thanh Dinh about 2 years
    I know that you've used metaphors (to keep the answer short), but saying about the correctness of metaphors is difficult.