Transcription

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Dnjansen (talk | contribs) at 00:42, 4 January 2003. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Transcription is the conversion of the spoken word into the written language. This may be the transcription of a complete conversation, e. g., the proceedings of a court hearing, or of a single word.

In the latter case, transcription is the process of matching the sounds of human speech to written symbols using a set of standard rules, so that these sounds can be reproduced later. Usually these rules are organized on a phonetic basis and are specifically constructed in order to be maximally simple. Standard transcription schemes include the International Phonetic Alphabet (IPA), and its ASCII equivalent, SAMPA. One can see numerous examples of transcription on the Common phrases in different languages page (in this particular case, using the standard English spelling rules).
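The relationship between IPA and its ASCII equivalent can be illustrated with a small lookup table. The following is a minimal sketch, assuming only a handful of symbol pairs; the table `IPA_TO_SAMPA` and the function `ipa_to_sampa` are hypothetical names chosen for illustration, and the subset shown is far from the complete SAMPA inventory.

```python
# Illustrative subset of IPA -> SAMPA correspondences (not the full table).
IPA_TO_SAMPA = {
    "ʃ": "S",  # as in "ship"
    "θ": "T",  # as in "thin"
    "ð": "D",  # as in "this"
    "ŋ": "N",  # as in "sing"
    "ə": "@",  # schwa
    "ɪ": "I",  # as in "bit"
    "æ": "{",  # as in "cat"
}


def ipa_to_sampa(ipa: str) -> str:
    """Replace each known IPA symbol with its SAMPA equivalent.

    Symbols missing from the table (for example plain ASCII letters
    such as "p" or "k") are passed through unchanged.
    """
    return "".join(IPA_TO_SAMPA.get(symbol, symbol) for symbol in ipa)


if __name__ == "__main__":
    for word in ("θɪŋ", "ʃɪp", "kæt"):
        print(word, "->", ipa_to_sampa(word))
```

A character-by-character lookup is enough here because every symbol in the subset is a single code point; a fuller converter would also have to handle diacritics and multi-character symbols.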

Specialised sense: Transcription from one language to another

In a more specialised sense, a transcription is (a system of) writing the sounds of a word in one language using the script of another language. Any reader of the latter language should be able to pronounce the transcribed word (almost) correctly. As the word may contain sounds that are unknown in the latter language, this goal is not always reached completely.

Transcriptions are used when writing for the general public, for example in a newspaper or a general-purpose encyclopedia.

Transcription should be distinguished from transliteration in a narrow sense, q. v. However, transcription is sometimes also called transliteration.

The same word is likely to be transcribed differently under different systems. For example, in two transcription systems for Mandarin that use the Roman alphabet, the name of the Chinese capital is written Pei-ching in Wade-Giles and Beijing in Pinyin; the older form Peking comes from the postal romanization. See also transcription of Chinese, transcription of Russian.

Example:

Russian text: Борис Николаевич Ельцин
Typical transliteration: Boris Nikolaevič El'cin
English transcription: Boris Nikolayevitch Yeltsin
French transcription: Boris Nikolaïevitch Ieltsine
German transcription: Boris Nikolajewitsch Jelzin
Italian transcription: Boris Nikolaevic Eltsin
Dutch transcription: Boris Nikolajewitsj Jeltsin
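The difference between a letter-by-letter transliteration and the language-specific transcriptions above can be made concrete with a small table-driven sketch. It assumes a scientific-style mapping restricted to the Cyrillic letters needed for the first two names in the example; `TRANSLIT` and `transliterate` are hypothetical names, and a complete table would cover the whole alphabet.

```python
# Letter-by-letter mapping for a subset of Cyrillic (illustrative only).
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "е": "e", "и": "i", "к": "k",
    "л": "l", "н": "n", "о": "o", "р": "r", "с": "s", "ч": "č",
}


def transliterate(text: str) -> str:
    """Map each Cyrillic letter to one Latin letter, preserving case.

    Characters that are not in the table (spaces, punctuation, letters
    outside the subset) are copied through unchanged.
    """
    result = []
    for char in text:
        latin = TRANSLIT.get(char.lower())
        if latin is None:
            result.append(char)
        elif char.isupper():
            result.append(latin.capitalize())
        else:
            result.append(latin)
    return "".join(result)


if __name__ == "__main__":
    print(transliterate("Борис Николаевич"))
```

A transcription, by contrast, cannot be written as a single fixed table of this kind: the same Cyrillic letter is rendered differently depending on the spelling conventions of the target language, as the English, French, German, Italian and Dutch rows show.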

Transcription can also be done into a non-alphabetic script. For example, in a Beijing newspaper, President Bush's name is transcribed as two Chinese characters that sound like "Bu4 Shu1" (布殊), using the characters that mean "cloth" and "weird".

After transcribing

After transcribing a word from one language to the script of another language:

  • one or both languages may develop further. The original correspondence between the sounds of the two languages may change, and so the pronunciation of the transcribed word develops in a different direction than the original pronunciation.
  • the transcribed word may be adopted as a loan word in another language with the same script. This often leads to a different pronunciation and spelling than a direct transcription.

This is especially evident for Greek loan words and proper names. Greek words are normally first transcribed into Latin (according to their old pronunciations) and then borrowed into other languages, where the loan word finally develops according to the rules of the target language. For example, Aristotle is the currently used English form of the name of the philosopher whose name in Greek is spelled Ἀριστοτέλης (Aristotélēs), which was transcribed into Latin as Aristóteles, from where it was borrowed into other languages and followed their linguistic development. (In the "classical" Greek of Aristotle's time, lower-case letters were not used, and the name was spelled ΑΡΙΣΤΟΤΕΛΗΣ.)

Pliocene comes from the Greek words πλεῖον (pleîon, "more") and καινός (kainós, "new"), which were first transcribed (latinised) to plion and caenus and then borrowed into other languages. The historicising latinisation of <κ> as <c> goes back to the time when Latin pronounced <c> as [k] in all contexts.

When this process continues over several languages, it may fail miserably at conveying the original pronunciation. One ancient example is the Sanskrit word dhyāna, which was transcribed into Chinese as Ch'an-na (禪那), later shortened to Ch'an (禪), through Buddhist scriptures. Ch'an was borrowed into Japanese as Zen, and from Japanese it was transcribed into English as Zen. Dhyāna to Zen is quite a change.

Another complication is a later change in the "preferred" transcription. For instance, the word describing a philosophy or religion in China was popularized in English as Tao and given the ending -ism to produce the English word Taoism. That transcription reflects the Wade-Giles system. The more recent Pinyin transcription produces Dao and Daoism. (See also Daoism versus Taoism.)

In genetics, transcription is the process of copying DNA into mRNA by an enzyme called RNA polymerase (RNAP). It is the first step of protein biosynthesis.

Bacterial transcription

A (simple) model of a bacterial gene to be transcribed looks like this:

  upstream        ~17 bp       The gene to transcribe     downstream
5'----------|-35|---------|-10|----------------------|T|------------3'
3'----------|-35|---------|-10|----------------------|T|------------5'
                               |
                               |--------------------->
                                        mRNA

where the region between the -35 and -10 base pairs is called the promoter, and |T| stands for the terminator. The DNA between promoter and terminator is copied into mRNA, which is then translated into protein.
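The copying step itself can be sketched in a few lines. This is a minimal illustration, assuming the gene is given as a plain string of the letters A, C, G and T; the function names `transcribe_template` and `transcribe_coding` are hypothetical, and the sketch ignores promoters, terminators and everything else RNAP needs in order to start and stop.

```python
# Base pairing used when an RNA strand is built on a DNA template.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') built on a template strand given 3'->5'.

    Each DNA base is paired with its RNA complement, so the result is
    complementary to the template strand.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5)


def transcribe_coding(coding_5_to_3: str) -> str:
    """Return the mRNA (5'->3') corresponding to a coding strand.

    The mRNA has the same sequence as the coding strand, with uracil (U)
    in place of thymine (T).
    """
    return coding_5_to_3.replace("T", "U")


if __name__ == "__main__":
    print(transcribe_template("TACGGT"))
    print(transcribe_coding("ATGCCA"))
```

Both calls describe the same short double-stranded fragment, once from the template strand and once from the coding strand, so they yield the same mRNA.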

Promoters can differ in strength, that is, how attractive they are for RNAP. The more similar they are to a consensus sequence, the stronger they are. The "ideal" promoter in E. coli looks like this:

5'----TTGACA---|17 bp|----TATAAT---|7bp|---|purines|----3'
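The idea that a promoter is stronger the more it resembles the consensus can be turned into a rough score. The sketch below assumes that similarity is simply the fraction of positions in the -35 and -10 hexamers that match the consensus; this scoring rule and the name `promoter_score` are illustrative assumptions, and real promoter strength also depends on factors such as the spacer length that are ignored here.

```python
CONSENSUS_MINUS_35 = "TTGACA"
CONSENSUS_MINUS_10 = "TATAAT"


def count_matches(sequence: str, consensus: str) -> int:
    """Count positions at which the sequence agrees with the consensus."""
    if len(sequence) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return sum(a == b for a, b in zip(sequence, consensus))


def promoter_score(box_minus_35: str, box_minus_10: str) -> float:
    """Return the fraction (0.0 to 1.0) of consensus positions matched."""
    matched = (count_matches(box_minus_35, CONSENSUS_MINUS_35)
               + count_matches(box_minus_10, CONSENSUS_MINUS_10))
    total = len(CONSENSUS_MINUS_35) + len(CONSENSUS_MINUS_10)
    return matched / total


if __name__ == "__main__":
    print(promoter_score("TTGACA", "TATAAT"))  # the "ideal" promoter
    print(promoter_score("TTTACA", "TATGTT"))  # a promoter with three mismatches
```

Under this simplification, the "ideal" promoter scores 1.0 and every mismatch lowers the score by one twelfth.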

Initiation

The RNA polymerase holoenzyme consists of a core, made of four subunits (ααββ'), and the σ-factor. The following steps occur upon initiation:

  1. The RNAP recognizes the promoter region of the gene and binds to the DNA at that specific location. At this stage the DNA is still double-stranded; this is called the closed complex.
  2. The DNA is unwound and becomes single-stranded at the initiation site (the -10 promoter region). This is called the open complex.
  3. The DNA is melted (the strands are locally separated), the σ-factor leaves the holoenzyme, and the transcription process begins. This is the start of the elongation phase.

Elongation

The RNAP runs along the DNA, synthesizing mRNA in the process. In bacteria, the nascent mRNA is translated right away by ribosomes.

Termination

The elongation stops if:

  • The terminator is reached. The terminator is usually a palindromic DNA sequence that forms a hairpin.
  • A ρ-factor (a protein) binds and runs along the mRNA towards the RNAP. When the ρ-factor reaches the RNAP, it causes the RNAP to dissociate from the DNA, terminating transcription.
  • The RNAP comes across a region with repetitious base pairs (for example, TTTTTT). This will terminate transcription.
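The two sequence-based conditions in the list above (a palindromic stretch that can form a hairpin, and a run of repeated bases) can be searched for in a DNA string. The following is a minimal sketch under simplifying assumptions: a hairpin is taken to be a stem of fixed length followed, after a short loop, by its exact reverse complement. The function names and the default stem and loop sizes are hypothetical choices for illustration, and ρ-dependent termination is not modelled.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def find_hairpin(dna: str, stem: int = 5, min_loop: int = 3, max_loop: int = 8):
    """Return (start, end) of the first stem-loop-stem region, or None.

    A region qualifies when a stretch of `stem` bases is followed, after
    a loop of min_loop..max_loop bases, by its exact reverse complement.
    `end` is exclusive, so dna[start:end] is the whole hairpin region.
    """
    for start in range(len(dna) - 2 * stem - min_loop + 1):
        target = reverse_complement(dna[start:start + stem])
        for loop in range(min_loop, max_loop + 1):
            second = start + stem + loop
            if dna[second:second + stem] == target:
                return start, second + stem
    return None


def has_base_run(dna: str, base: str = "T", length: int = 6) -> bool:
    """Report whether the DNA contains `length` consecutive copies of `base`."""
    return base * length in dna


if __name__ == "__main__":
    example = "CCGCCGCTTTTGCGGCTTTTTT"
    print(find_hairpin(example))
    print(has_base_run(example))
```

Requiring an exact reverse complement keeps the search simple; real hairpins tolerate mismatches, so this sketch will miss many of them.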

Eukaryotic transcription

Gene expression in eukaryotes is largely controlled by transcription via transcription factors. As eukaryotes are much more complex than prokaryotes, and have their genetic material stored in the nucleus, the transcription mechanisms are more complicated here. For example, eukaryotes have three RNA polymerases, in contrast to prokaryotes, which only have one.

  • RNA Polymerase I is located in the nucleolus and transcribes only rRNAs.
  • RNA Polymerase II is the "standard" RNAP.
  • RNA Polymerase III transcribes tRNAs and other small RNAs.

Also, eukaryotic RNAPs need specific accessory proteins to become active. The C-terminus of all RNAPs is highly conserved and contains the actual transcriptional mechanism.

Initiation

The core promoter of eukaryotic genes stretches from position -45 to 0. Additionally, there can be an upstream control element present at the -180 to -107 region, which can amplify the RNAP binding by a factor of up to 100. This UCE usually contains a TATA box, a highly conserved DNA sequence that reads

T A T A T/A A
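The sequence above can be searched for with a regular expression. This sketch assumes that "T/A" means "either T or A at that position", which gives the pattern `TATA[TA]A`; the function name `find_tata_boxes` is hypothetical. A lookahead is used so that overlapping matches are also reported.

```python
import re

# "T A T A T/A A" read as: T, A, T, A, then T or A, then A.
TATA_BOX = re.compile(r"(?=TATA[TA]A)")


def find_tata_boxes(dna: str) -> list[int]:
    """Return the 0-based start positions of every TATA-box match.

    The lookahead matches without consuming characters, so matches
    that overlap each other are all found.
    """
    return [match.start() for match in TATA_BOX.finditer(dna)]


if __name__ == "__main__":
    print(find_tata_boxes("GGCTATAAAGGC"))
    print(find_tata_boxes("TATATATA"))
```

An exact pattern match is a simplification: it says nothing about where in the promoter region the match lies or whether the site is actually used.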

A similar sequence, though not as highly conserved, is found in the INR element (initiator element, part of the complex core promoter).

Elongation

Termination

A major difference between prokaryotic and eukaryotic transcription is that the latter involves splicing of the primary transcript, which modifies the mRNA created during transcription.
