Programming language

Computer code (HTML with JavaScript) in a tool that uses syntax highlighting (colors) to help the developer see the purpose of each piece of code.

A programming language is a stylized communication technique intended to be used for controlling the behaviour of a machine (often a computer). Like human languages programming languages have syntactic and semantic rules used to define meaning.

Thousands^[1] of different programming languages have been created and new ones are created every year. (see list of programming languages). Few languages ever become sufficiently popular that they are used by more than a few people, but professional programmers are likely to use dozens of different languages during their career.

Definitions of programming language

There is no universally agreed definition for the term programming language. The following is a list of some of the methods that have been used to categorize a language as being a programming language.

What it is used for. For instance, a programming language is a language used to write programs.
Those involved in the interaction. For instance, a programming language differs from natural languages in that natural languages are used for interaction between people, while programming languages are used for communication from people to machines (this rules out languages used for computer to computer interaction).
The constructs it contains. For instance, a programming language contains constructs for defining and manipulating data structures, and for controlling the flow of execution.
Its expressive power. The theory of computation provides a classification of languages based on the range of computations expressible by them, with the most expressive language being one that is Turing complete (the language needs to contain at least a goto construct and a method of storing values, ie, variables). Any algorithm that can be implemented in a language that is Turing complete can also be implemented in another language that is Turing complete. Some examples of languages that are not Turing complete are HTML (although it can contain Turing complete languages such as PHP and Javascript) and SQL (SQL vendors invariably add language extensions that create a Turing complete language, e.g., PL/SQL).

Specification

The specification of a language can take several forms, including:

a description of the language's syntax and semantics (e.g., C language). The latter usually lists the results of performing an operation and restrictions on the kinds of operations that can be performed.
a description of the behavior of a translator (e.g., C++). The syntax and semantics of the language can be inferred from this.
the language is defined in terms of an executable specification (e.g., Prolog), often written in that language.

Features of programming language

Each programming language can be thought of as a set of formal specifications concerning syntax, vocabulary, and meaning.

These specifications usually include:

Type system
Data structures
Instruction and control flow
Design philosophy
Compilation and interpretation

Those languages that are widely used – or have been used for a considerable period of time – have standardization bodies that meet regularly to create and publish formal definitions of the language and discuss the extension of existing definitions.

Type system

A type system defines how a programming language classifies values and variables into types, how it can manipulate those types and how they interact. The design and study of type systems is known as type theory.

Internally, all data in modern digital computers are stored simply as zeros or ones (binary). The data typically represent information in the real world such as names, bank accounts and measurements, so the low-level binary data are organized by programming languages into these high-level concepts as data types. There are also more abstract types whose purpose is just to warn the programmer about semantically meaningless statements.

Languages can be classified with respect to their type systems.

static vs. dynamic typing: This distinction refers to when values have a type associated with them. In a dynamically typed language types are an attribute of data, and thus a variable acquires a type by being assigned a particular value at runtime. In statically typed languages, variables are given a type by the programmer at compile time, and data assigned to the variable must match its type.
strong vs. weak typing: This distinction refers to the behavior of the language when values are operated on. In a strongly typed language, operations must match the types of the values operated on (addition must be performed on numbers, for example). In weakly typed languages operations might be made on values which are inappropriate.

Type inference is a mechanism whereby the type specifications can often be omitted completely, if it is possible for the compiler to infer the types of values from the contexts in which they are used – for example, if a variable is assigned the value 1, a type-inferring compiler does not need to be told explicitly that the variable is an integer. There are however many different uses for integers; it might e.g. make sense in a program to prevent inadvertent adding of a phone number to the number of apples in a box. Therefore some languages such as Ada allow defining different kinds of incompatible integers; this is called strong typing. Type-inferred languages can be more flexible to use, particularly when they also implement parametric polymorphism.

It is possible to perform type inference on programs written in a dynamically typed language, but it is entirely possible to write programs in these languages that make type inference infeasible. Sometimes dynamically typed languages are called latently typed.

Note that strong vs. weak is a continuum; Java is a strongly typed language relative to C, but is weakly typed relative to ML. Use of these terms is often a matter of perspective, much in the way that an assembly language programmer would consider C to be a high-level language while a Java programmer would consider C to be a low-level language.

Strong and static are orthogonal concepts. See the cross reference table below. But beware that some people incorrectly use the term strongly typed to mean strongly, statically typed, or, even more confusingly, to mean simply statically typed – in the latter usage, C would be called strongly typed, despite the fact that C doesn't catch that many type errors and that it's both trivial and common to defeat its type system (even accidentally).

Aside from when and how the correspondence between expressions and types is determined, there's also the crucial question of what types the language defines at all, and what types it allows as the values of expressions (expressed values) and as named values (denoted values). Low-level languages like C typically allow programs to name memory locations, regions of memory, and compile-time constants, while allowing expressions to return values that fit into machine registers; ANSI C extended this by allowing expressions to return struct values as well (see record). Functional languages often restrict names to denoting run-time computed values directly, instead of naming memory locations where values may be stored, and in some cases refuse to allow the value denoted by a name to be modified at all. Languages that use garbage collection are free to allow arbitrarily complex data structures as both expressed and denoted values.

Finally, in some languages, procedures are allowed only as denoted values (they cannot be returned by expressions or bound to new names); in others, they can be passed as parameters to routines, but cannot otherwise be bound to new names; in others, they are as freely usable as any expressed value, but new ones cannot be created at run-time; and in still others, they are first-class values that can be created at run-time.

Type system cross reference list

This is a comparison of the features of the type systems and type checking of multiple programming languages.

Brief definitions

A nominal type system means that the language decides whether types are compatible and/or equivalent based on explicit declarations and names.
A structural type system means that the language decides whether types are compatible and/or equivalent based on the definition and characteristics of the types.
Type checking determines whether and when types are verified. Static checking means that type errors are reported based on a program's text (source code). Dynamic checking means that type errors are reported based on a program's dynamic (run-time) behavior.

Language	Type safety	Type expression	Type compatibility and equivalence	Type checking
A+	strong			dynamic
ActionScript 3.0	strong	implicit with optional explicit typing		static
ABC	strong
ABAP	strong		nominal	static
Ada	strong^{[TS 1]}	explicit	nominal	static
Agda	strong		nominal	static
Aldor	weak	partially implicit^[2]		static
Alef	strong			static
ALGOL 58	strong	explicit		static
ALGOL 60	strong	explicit		static
ALGOL 68	strong	explicit	structural	static & tagged unions
ALGOL W	strong			static
Alice	strong	implicit with optional explicit		static
Alma-0				static
AmbientTalk	strong			dynamic
AMOS BASIC				static
AngelScript	strong			static
APL	strong			dynamic
AppleScript	weak			dynamic
Arc				dynamic
Assembly	?	?	?	?
AutoHotkey	typeless	—	—	—
AutoLISP				dynamic
Ateji PX	strong	explicit	nominal	static
AWK	weak	implicit		dynamic
B	typeless
Ballerina	strong		structural	static
Bash	?	?	?	?
BASIC	strong	explicit	nominal	static
BCPL	typeless
BeanShell	strong		nominal	dynamic
BLISS	typeless	—	—	—
Boo	strong	implicit with optional explicit typing		static with optional dynamic typing
Bro	strong	implicit with optional explicit typing	nominal	static
C	weak	explicit	nominal	static
C--	weak			static
C++ (ISO/IEC 14882)	strong	explicit with optional implicit typing (by using auto in C++11)	nominal	static^{[TS 2]}
C*	weak	explicit		static
C#	strong ^[3]	implicit with optional explicit typing	nominal	static^{[TS 3]}
C shell	?	?	?	?
Caml	strong	implicit with optional explicit typing		static
Cecil				dynamic with optional static typing
Clean	strong	implicit		static
Ceylon	strong			static
Chapel		implicit with optional explicit typing		static
CHILL	strong			static
ChucK	strong
Cilk	weak	explicit		static
Claire	strong	implicit with optional explicit typing		dynamic with optional static typing
Clean	strong			?
Clojure	strong	implicit with optional explicit typing		dynamic
CLU	strong
COBOL	strong	explicit	nominal	static
Cobra	strong	explicit with optional implicit typing		static with optional dynamic typing
CoffeeScript		implicit		dynamic
ColdFusion (CFML)	strong	implicit		dynamic
COMAL	strong
Common Lisp	strong	implicit with optional explicit typing	structural for implicit typing, nominal for explicit typing	dynamic, some static checking(depending on implementation)
Component Pascal	strong			static
Cool	strong	explicit		static
CORAL	strong			static
Crystal		implicit with optional explicit typing^[4]	structural	static
Cuneiform		explicit		static
Curl	strong		nominal
Curry	strong	implicit with optional explicit typing		static
Cython	strong	implicit with optional explicit typing	nominal (extension types) and structural (Python)	dynamic with optional static typing
D	weak^{[TS 4]}	explicit	nominal	static
Dart	strong^[5]	gradual typing	nominal	static with optional dynamic typing
Dylan	strong			dynamic
Eiffel	strong		nominal	static
Elixir	strong	implicit		dynamic
Erlang	strong	implicit		dynamic
Euphoria	strong	explicit, implicit with objects	nominal	static, dynamic with objects
F#	strong	implicit	nominal	static
Forth	typeless	—	—	—
Fortran	strong	explicit^{[TS 5]}	nominal	static
Gambas	strong	explicit	nominal
GLBasic	strong	explicit. Non-explicit declarations available through project options	nominal	static
Gleam	strong	implicit with optional explicit	nominal	static
Go^[6]	strong	partially implicit (local type inference)	structural	static
Gosu	strong	partially implicit (local type inference)	nominal (subclassing) and structural	static
Groovy	strong	implicit with optional explicit typing		dynamic with optional static typing
Harbour	strong	implicit with optional explicit typing		dynamic
Haskell	strong	implicit with optional explicit typing	nominal^[7]^[8]	static
Haxe	strong	implicit with optional explicit typing	nominal (subclassing) and structural	static with optional dynamic typing
Io	strong	implicit		dynamic
icon	strong	implicit		dynamic
ISLISP	strong			dynamic
J	strong			dynamic
Java	strong^[9]	explicit	nominal	static
JavaScript	weak	implicit	—	dynamic
Julia	strong	implicit with optional explicit typing^[10]	structural for implicit typing, nominal for explicit typing	dynamic
Joy	strong			dynamic
Kotlin	strong	partially implicit (local type inference)	nominal	static
LabVIEW	strong
Lua	strong	implicit		dynamic
Maple	strong			dynamic
Mercury	strong			static
Mathematica	strong			dynamic
MATLAB M-code	strong			dynamic
Modula-2	weak^{[TS 4]}	explicit	nominal	static
Modula-3	weak^{[TS 4]}	explicit	structural	static
MUMPS (M)	typeless	—	—	—
Neko				dynamic
Nemerle	strong	implicit	nominal	static
NetLogo	strong	implicit		dynamic
NetRexx	strong	implicit with optional explicit		dynamic with optional static typing
newLisp		implicit		dynamic
NEWP	strong			static
Newspeak				dynamic
NewtonScript				dynamic
Nial				dynamic
Nim	strong	partially implicit (type inference)		static
Nickle	strong
Nu				dynamic
Oberon	strong	explicit	nominal	static and partially dynamic^{[TS 6]}
Objective-C	weak	explicit	nominal	dynamic with optional static typing^[11]
OCaml	strong	implicit with optional explicit typing	nominal for records,^[12] structural for objects^[8]^[13]	static
Object Pascal	strong	explicit	nominal	static
Opa	strong	implicit with optional explicit typing	structural	static
Oxygene	weak	implicit		static
Oz-Mozart	strong	implicit	structural	dynamic
Pascal	weak^{[TS 4]}	explicit	nominal	static
Perl 5		implicit		dynamic
PHP	weak	implicit with optional explicit typing	nominal	dynamic
Plus	strong	explicit	structural	static, dynamic (optional)
Prolog				dynamic
Pure				dynamic
PureScript	strong	implicit with optional explicit typing	nominal	static
Python	strong	implicit (with optional explicit typing as of 3.5)	nominal	dynamic
R		implicit		dynamic
Raku		partially implicit^{[TS 7]}		dynamic with optional static typing
REBOL	strong	implicit		dynamic
Rexx	typeless	—, implicit wrt numbers	—	static+dynamic wrt numbers
RPG	weak			static
Ruby	strong	implicit	—	dynamic
Rust	strong	explicit with optional implicit typing^[14]	mostly nominal	static
S				dynamic
S-Lang	strong	implicit		dynamic
Scala	strong	partially implicit (local type inference)	nominal (subclassing) and structural	static
Scheme	strong	implicit		dynamic (latent)
Seed7	strong	explicit	nominal	static
Simula	strong			static^{[TS 8]}
Smalltalk	strong	implicit		dynamic
Swift	strong	partially implicit (local type inference)	nominal (subclassing) and structural	static
Standard ML	strong	implicit with optional explicit typing	structural	static
Tcl				dynamic
TypeScript	strong	optional	structural	static
Unicon	strong	implicit		dynamic
Visual Basic	strong	implicit with optional explicit typing	nominal	static
Visual Basic (.NET)	weak^{[TS 4]}	explicit		static
Visual Prolog	strong	partially implicit	nominal	static
Wolfram Language	strong			dynamic
Windows PowerShell	strong	implicit		dynamic
XL	strong		nominal	static
Xojo	strong	explicit	nominal	static
XPath/XQuery	strong	partially implicit	nominal	dynamic with optional static typing
Language	Type safety	Type expression	Type compatibility and equivalence	Type checking

Notes

^ Unsafe operations are well isolated by a "Unchecked_" prefix.
^ with optional dynamic type casting (see dynamic cast)
^ with optional dynamic type (see dynamic member lookup)
^ ^a ^b ^c ^d ^e It is almost safe, unsafe features are not commonly used.
^ Optionally, typing can be explicitly implied by the first letter of the identifier (known as implicit typing within the Fortran community).
^ dynamic checking of type extensions i.e. inherited types
^ explicit for static types
^ optional for formal and virtual procedures

References

^ As of March 2006 The Encyclopedia of Computer Languages by Murdoch University, Australia lists 8276 computer languages.
^ Aldor User Guide (PDF). Aldor.org. 2002. pp. 40, 61. Retrieved 3 June 2021.
^ https://learn.microsoft.com/en-us/dotnet/csharp/fundamentals/types/
^ "Type Inference Crystal". Crystal Language Reference. Retrieved 3 June 2021.
^ "The Dart type system". dart.dev. Retrieved 2020-04-08.
^ The Go Programming Language Specification
^ Löh, Andres. "Why does Haskell not have records with structural typing?". Stack Overflow. Archived from the original on 2016-03-14. Retrieved 2020-04-13.
^ ^a ^b King, Alexis (2020-01-19). "No, dynamic type systems are not inherently more open". lexi-lambda.github.io. Archived from the original on 2020-03-01. Retrieved 2020-04-13.
^ Sheng Liang, Gilad Bracha. Dynamic class loading in the Java virtual machine. Volume 33, Issue 10 of ACM SIGPLAN Notices, October 1998.
^ "Types · the Julia Language". Archived from the original on 2018-07-24. Retrieved 2018-07-24.
^ Developer.apple.com Archived June 10, 2009, at the Wayback Machine
^ "Record · Reason". reasonml.github.io. Archived from the original on 2020-03-31. Retrieved 2020-04-13.
^ "Structural type system", Wikipedia, 2019-03-29, retrieved 2020-04-13
^ "rust-lang/rustc-dev-guide". GitHub. Retrieved 2020-04-08.

Data structures

Most languages also provide ways to assemble complex data structures from built-in types and to associate names with these new combined types (using arrays, lists, stacks, files).

Object oriented languages allow the programmer to define data-types called "Objects" which have their own intrinsic functions and variables (called methods and attributes respectively). A program containing objects allows the objects to operate as independent but interacting sub-programs: this interaction can be designed at coding time to model or simulate real-life interacting objects. This is a very useful, and intuitive, functionality. Languages such as Python and Ruby have developed as OO (Object oriented) languages. They are comparatively easy to learn and to use, and are gaining popularity in professional programming circles, as well as being accessible to non-professionals. It is commonly thought that object-orientation makes languages more intuitive, increasing the public availability and power of customized computer applications.

Instruction and control flow

Once data has been specified, the machine must be instructed how to perform operations on the data. Elementary statements may be specified using keywords or may be indicated using some well-defined grammatical structure.

Each language takes units of these well-behaved statements and combines them using some ordering system. Depending on the language, differing methods of grouping these elementary statements exist. This allows one to write programs that are able to cover a variety of input, instead of being limited to a small number of cases. Furthermore, beyond the data manipulation instructions, other typical instructions in a language are those used for control flow (branches, definitions by cases, loops, backtracking, functional composition).

Design philosophy

For the above-mentioned purposes, each language has been developed using a special design or philosophy. Some aspect or another is particularly stressed by the way the language uses data structures, or by which its special notation encourages certain ways of solving problems or expressing their structure.

Since programming languages are artificial languages, they require a high degree of discipline to accurately specify which operations are desired. Programming languages are not error tolerant; however, the burden of recognizing and using the special vocabulary is reduced by help messages generated by the programming language implementation.

There are a few languages which offer a high degree of freedom in allowing self-modification in which a program re-writes parts of itself to handle new cases. Typically, only machine language, Prolog, PostScript, and the members of the Lisp family (Common Lisp, Scheme) provide this capability. In MUMPS language this technique is called dynamic recompilation; emulators and other virtual machines exploit this technique for greater performance.

Compilation and interpretation

There are, broadly, two approaches to execute a program written in a given language. These approaches are known as compilation, done by a program known as a compiler; and interpretation, done by an interpreter. Some programming language implementations support both interpretation and compilation.

An interpreter parses a computer program and executes it directly. One can imagine this as following the instructions of the program line-by-line. In contrast, a compiler translates the program into machine code – the native instructions understood by the computer's processor. The compiled program can then be run by itself.

Compiled programs usually run faster than interpreted ones, because the overhead of understanding and translating the programming language syntax has already been done. However, interpreters are frequently easier to write than compilers, and can more easily support interactive debugging of a program.

Many modern languages use a mixture of compilation and interpretation. For example, the "compiler" for a bytecode-based language translates the source code into a partially compiled intermediate format, which is later run by a fast interpreter called a virtual machine. Some "interpeters" actually use a just-in-time compiler, which compiles the code to machine language immediately before running it. These techniques are often combined. Like other aspects of programming languages, "compiled" and "interpreted" may be best understood as opposite ends of a spectrum, rather than the only two options.

History of programming languages

The development of programming languages follows closely the development of the physical and electronic processes used in today's computers.

Programming languages have been under development for years and will remain so for many years to come. They got their start with a list of steps to wire a computer to perform a task. These steps eventually found their way into software and began to acquire newer and better features. The first major languages were characterized by the simple fact that they were intended for one purpose and one purpose only, while the languages of today are differentiated by the way they are programmed in, as they can be used for almost any purpose. And perhaps the languages of tomorrow will be more natural with the invention of quantum and biological computers.

Charles Babbage is often credited with designing the first computer-like machines, which had several programs written for them (in the equivalent of assembly language) by Ada Lovelace.

In the 1940s the first recognizably modern, electrically powered computers were created. Some military calculation needs were a driving force in early computer development, such as encryption, decryption, trajectory calculation and massive number crunching needed in the development of atomic bombs. At that time, computers were extremely large, slow and expensive: advances in electronic technology in the post-war years led to the construction of more practical electronic computers. At that time only Konrad Zuse imagined the use of a programming language (developed eventually as Plankalkül) like those of today for solving problems.

Subsequent breakthroughs in electronic technology (transistors, integrated circuits, and chips) drove the development of increasingly reliable and more usable computers. The first widely used high-level programming language was FORTRAN, developed during 1954–57 by an IBM team led by John W. Backus. It is still widely used for numerical work, with the latest international standard released in 2004. A Computer Languages History graphic shows a timeline from FORTRAN in 1954.

Shortly after, Lisp was introduced. Lisp was based on lambda calculus, and is far more regular in its syntax than most non-Lisp derived languages.

Dennis Ritchie developed the C programming language, initially for DEC PDP-11 in 1970.

During the 1970s, Xerox PARC developed Smalltalk, an object oriented language.

Based on the development of Smalltalk and other object oriented languages, Bjarne Stroustrup developed a programming language based on the syntax of C, called C++ in 1985.

Sun Microsystems released Java in 1995 which became very popular as an introductory programming language taught in universities. Microsoft presented the C# programming language in 2001 which is very similar to C++ and Java. There are many, many other languages (cf. List of programming languages).

Classifications of programming languages

Array
Aspect-oriented
Assembly
Concatenative
Concurrent
Curly-bracket
Data-structured
Dataflow
Declarative
Domain-specific
Dynamic
Educational
Esoteric
Functional
General-purpose
Imperative
Interface description
Logic
Multiparadigm
Object-oriented
- Prototype-based
Pattern directed invocation
Procedural
Quantum
Reflective
Scripting
Synchronous
Visual
Lists of all programming languages
- Alphabetical
- Categorical
- Chronological
- Generational

Formal semantics

The rigorous definition of the meaning of programming languages is the subject of formal semantics.

Notes

External links

Syntax Patterns for Various Languages
99 Bottles of Beer - a collection of implementations in many languages
History of Programming Languages (HOPL)
Open Directory - Computer Programming Languages
Dictionary of Programming Languages
Programming Language Comparison
Which Programming Language is right for you?

Template:Major programming languages small

Template:Link FA Template:Link FA Template:Link FA

[2] Unsafe operations are well isolated by a "Unchecked_" prefix.

[4] with optional dynamic type casting (see dynamic cast)

[6] with optional dynamic type (see dynamic member lookup)

[r2-8] It is almost safe, unsafe features are not commonly used.

[10] Optionally, typing can be explicitly implied by the first letter of the identifier (known as implicit typing within the Fortran community).

[16] ynamic checking of type extensions i.e. inherited types

[20] xplicit for static types

[22] tional for formal and virtual procedures

[1] As of March 2006 The Encyclopedia of Computer Languages by Murdoch University, Australia lists 8276 computer languages.

[3] Aldor User Guide (PDF). Aldor.org. 2002. pp. 40, 61. Retrieved 3 June 2021.

[5] ttps://learn.microsoft.com/en-us/dotnet/csharp/fundamentals/types/

[7] "Type Inference Crystal". Crystal Language Reference. Retrieved 3 June 2021.

[9] "The Dart type system". dart.dev. Retrieved 2020-04-08.

[11] The Go Programming Language Specification

[12] Löh, Andres. "Why does Haskell not have records with structural typing?". Stack Overflow. Archived from the original on 2016-03-14. Retrieved 2020-04-13.

[:0-13] King, Alexis (2020-01-19). "No, dynamic type systems are not inherently more open". lexi-lambda.github.io. Archived from the original on 2020-03-01. Retrieved 2020-04-13.

[14] Sheng Liang, Gilad Bracha. Dynamic class loading in the Java virtual machine. Volume 33, Issue 10 of ACM SIGPLAN Notices, October 1998.

[15] "Types · the Julia Language". Archived from the original on 2018-07-24. Retrieved 2018-07-24.

[17] Developer.apple.com Archived June 10, 2009, at the Wayback Machine

[18] "Record · Reason". reasonml.github.io. Archived from the original on 2020-03-31. Retrieved 2020-04-13.

[19] "Structural type system", Wikipedia, 2019-03-29, retrieved 2020-04-13

[21] "rust-lang/rustc-dev-guide". GitHub. Retrieved 2020-04-08.

[1]

[TS 1]

[2]

[TS 2]

[3]

[TS 3]

[4]

[TS 4]

[5]

[TS 5]

[6]

[7]

[8]

[9]

[10]

[TS 6]

[11]

[12]

[13]

[TS 7]

[14]

[TS 8]

Definitions of programming language

Specification

Features of programming language

Type system

Type system cross reference list

Notes

References

Data structures

Instruction and control flow

Design philosophy

Compilation and interpretation

History of programming languages

Classifications of programming languages

Formal semantics

See also

Notes

External links