Pitch detection algorithm - Revision history

98.180.42.203: update URL for pitch extraction paper

2024-08-14T13:56:22Z

update URL for pitch extraction paper

← Previous revision		Revision as of 13:56, 14 August 2024
Line 3:		Line 3:
	A '''pitch detection algorithm''' ('''PDA''') is an [[algorithm]] designed to estimate the [[pitch (music)\|pitch]] or [[fundamental frequency]] of a [[quasiperiodic]] or [[oscillation\|oscillating]] signal, usually a [[digital recording]] of [[speech processing\|speech]] or a musical note or tone. This can be done in the [[time domain]], the [[frequency domain]], or both.		A '''pitch detection algorithm''' ('''PDA''') is an [[algorithm]] designed to estimate the [[pitch (music)\|pitch]] or [[fundamental frequency]] of a [[quasiperiodic]] or [[oscillation\|oscillating]] signal, usually a [[digital recording]] of [[speech processing\|speech]] or a musical note or tone. This can be done in the [[time domain]], the [[frequency domain]], or both.

	PDAs are used in various contexts (e.g. [[phonetics]], [[music information retrieval]], [[speech coding]], [[musical performance system]]s) and so there may be different demands placed upon the algorithm. There is as yet{{When\|date=October 2018}} no single ideal PDA, so a variety of algorithms exist, most falling broadly into the classes given below.<ref>D. Gerhard. [~~http~~://~~www~~.cs.uregina.ca/~~Research~~/~~Techreports~~/~~2003~~-06.pdf Pitch Extraction and Fundamental Frequency: History and Current Techniques], technical report, Dept. of Computer Science, University of Regina, 2003.</ref>		PDAs are used in various contexts (e.g. [[phonetics]], [[music information retrieval]], [[speech coding]], [[musical performance system]]s) and so there may be different demands placed upon the algorithm. There is as yet{{When\|date=October 2018}} no single ideal PDA, so a variety of algorithms exist, most falling broadly into the classes given below.<ref>D. Gerhard. [https://www2.cs.uregina.ca/~gerhard/publications/TRdbg-Pitch.pdf Pitch Extraction and Fundamental Frequency: History and Current Techniques], technical report, Dept. of Computer Science, University of Regina, 2003.</ref>

	A PDA typically estimates the period of a quasiperiodic signal, then inverts that value to give the frequency.		A PDA typically estimates the period of a quasiperiodic signal, then inverts that value to give the frequency.

Citation bot: Add: s2cid, bibcode, pmid, authors 1-1. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Abductive | Category:Audio engineering | #UCB_Category 57/240

2024-01-08T19:35:50Z

Add: s2cid, bibcode, pmid, authors 1-1. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Abductive | Category:Audio engineering | #UCB_Category 57/240

← Previous revision		Revision as of 19:35, 8 January 2024
Line 12:		Line 12:
	More sophisticated approaches compare segments of the signal with other segments offset by a trial period to find a match. AMDF ([[average magnitude difference function]]), ASMDF (Average Squared Mean Difference Function), and other similar [[autocorrelation]] algorithms work this way. These algorithms can give quite accurate results for highly periodic signals. However, they have false detection problems (often "''octave errors''"), can sometimes cope badly with noisy signals (depending on the implementation), and - in their basic implementations - do not deal well with [[polyphony\|polyphonic]] sounds (which involve multiple musical notes of different pitches).{{Cn\|date=October 2018}}		More sophisticated approaches compare segments of the signal with other segments offset by a trial period to find a match. AMDF ([[average magnitude difference function]]), ASMDF (Average Squared Mean Difference Function), and other similar [[autocorrelation]] algorithms work this way. These algorithms can give quite accurate results for highly periodic signals. However, they have false detection problems (often "''octave errors''"), can sometimes cope badly with noisy signals (depending on the implementation), and - in their basic implementations - do not deal well with [[polyphony\|polyphonic]] sounds (which involve multiple musical notes of different pitches).{{Cn\|date=October 2018}}

	Current{{When\|date=October 2018}} time-domain pitch detector algorithms tend to build upon the basic methods mentioned above, with additional refinements to bring the performance more in line with a human assessment of pitch. For example, the YIN algorithm<ref>{{cite journal \| ~~last~~=de Cheveigné \| ~~first~~=Alain \| last2=Kawahara \| first2=Hideki \| title=YIN, a fundamental frequency estimator for speech and music \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=111 \| issue=4 \| year=2002 \| issn=0001-4966 \| doi=10.1121/1.1458024 \| pages=1917–1930\|url=http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf}}</ref> and the MPM algorithm<ref>P. McLeod and G. Wyvill. [http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf A smarter way to find pitch.] In Proceedings of the International Computer Music Conference (ICMC’05), 2005.</ref> are both based upon [[autocorrelation]].		Current{{When\|date=October 2018}} time-domain pitch detector algorithms tend to build upon the basic methods mentioned above, with additional refinements to bring the performance more in line with a human assessment of pitch. For example, the YIN algorithm<ref>{{cite journal \| last1=de Cheveigné \| first1=Alain \| last2=Kawahara \| first2=Hideki \| title=YIN, a fundamental frequency estimator for speech and music \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=111 \| issue=4 \| year=2002 \| issn=0001-4966 \| doi=10.1121/1.1458024 \| pages=1917–1930\| pmid=12002874 \| bibcode=2002ASAJ..111.1917D \| s2cid=1607434 \|url=http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf}}</ref> and the MPM algorithm<ref>P. McLeod and G. Wyvill. [http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf A smarter way to find pitch.] In Proceedings of the International Computer Music Conference (ICMC’05), 2005.</ref> are both based upon [[autocorrelation]].

	==Frequency-domain approaches==		==Frequency-domain approaches==
Line 23:		Line 23:

	==Spectral/temporal approaches==		==Spectral/temporal approaches==
	Spectral/temporal pitch detection algorithms, e.g. the [[YAAPT pitch tracking algorithm]],<ref>{{cite journal \| ~~last~~=Zahorian \| ~~first~~=Stephen A. \| last2=Hu \| first2=Hongbing \| title=A spectral/temporal method for robust fundamental frequency tracking \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=123 \| issue=6 \| year=2008 \| issn=0001-4966 \| doi=10.1121/1.2916590 \| pages=4559–4571\|url=http://bingweb.binghamton.edu/~hhu1/paper/Zahorian2008spectral.pdf}}</ref><ref>Stephen A. Zahorian and Hongbing Hu. [http://ws2.binghamton.edu/zahorian/yaapt.htm YAAPT Pitch Tracking MATLAB Function]</ref> are based upon a combination of time domain processing using an [[autocorrelation]] function such as normalized cross correlation, and frequency domain processing utilizing spectral information to identify the pitch. Then, among the candidates estimated from the two domains, a final pitch track can be computed using [[dynamic programming]]. The advantage of these approaches is that the tracking error in one domain can be reduced by the process in the other domain.		Spectral/temporal pitch detection algorithms, e.g. the [[YAAPT pitch tracking algorithm]],<ref>{{cite journal \| last1=Zahorian \| first1=Stephen A. \| last2=Hu \| first2=Hongbing \| title=A spectral/temporal method for robust fundamental frequency tracking \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=123 \| issue=6 \| year=2008 \| issn=0001-4966 \| doi=10.1121/1.2916590 \| pages=4559–4571\| pmid=18537404 \| bibcode=2008ASAJ..123.4559Z \|url=http://bingweb.binghamton.edu/~hhu1/paper/Zahorian2008spectral.pdf}}</ref><ref>Stephen A. Zahorian and Hongbing Hu. [http://ws2.binghamton.edu/zahorian/yaapt.htm YAAPT Pitch Tracking MATLAB Function]</ref> are based upon a combination of time domain processing using an [[autocorrelation]] function such as normalized cross correlation, and frequency domain processing utilizing spectral information to identify the pitch. Then, among the candidates estimated from the two domains, a final pitch track can be computed using [[dynamic programming]]. The advantage of these approaches is that the tracking error in one domain can be reduced by the process in the other domain.

	==Speech pitch detection==		==Speech pitch detection==

Bjorn fiedelson: /* Frequency-domain approaches */

2022-09-22T01:27:05Z

Frequency-domain approaches

← Previous revision		Revision as of 01:27, 22 September 2022
Line 16:		Line 16:
	==Frequency-domain approaches==		==Frequency-domain approaches==
	Frequency domain, polyphonic detection is possible, usually utilizing the [[periodogram]] to convert the signal to an estimate of the [[frequency spectrum]]<ref>{{cite book \|title=Statistical Digital Signal Processing and Modeling \|last=Hayes \|first=Monson \|year=1996 \|publisher=John Wiley & Sons, Inc. \|isbn=0-471-59431-8 \|page=393}}</ref>		Frequency domain, polyphonic detection is possible, usually utilizing the [[periodogram]] to convert the signal to an estimate of the [[frequency spectrum]]<ref>{{cite book \|title=Statistical Digital Signal Processing and Modeling \|last=Hayes \|first=Monson \|year=1996 \|publisher=John Wiley & Sons, Inc. \|isbn=0-471-59431-8 \|page=393}}</ref>
	. This requires more processing power as the desired accuracy increases, although the well-known efficiency of the [[Fast Fourier transform\|FFT]], a key part of the [[periodogram]] algorithm, makes it suitably efficient for many purposes.		. This requires more processing power as the desired accuracy increases, although the well-known efficiency of the [[Fast Fourier transform\|FFT]], a key part of the periodogram algorithm, makes it suitably efficient for many purposes.

	Popular frequency domain algorithms include: the [[harmonic product spectrum]];<ref name="cnxpda">[http://cnx.org/content/m11714/latest/ Pitch Detection Algorithms], online resource from [[OpenStax CNX\|Connexions]]</ref><ref>A. Michael Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum and a Maximum Likelihood Estimate,” Proceedings of the Symposium on Computer Processing in Communications, Vol. XIX, Polytechnic Press: Brooklyn, New York, (1970), pp. 779–797.</ref> [[cepstrum\|cepstral]] analysis<ref>A. Michael Noll, “[https://asa.scitation.org/doi/abs/10.1121/1.1910339 Cepstrum Pitch Determination],” Journal of the Acoustical Society of America, Vol. 41, No. 2, (February 1967), pp. 293–309.</ref> and [[maximum likelihood]] which attempts to match the frequency domain characteristics to pre-defined frequency maps (useful for detecting pitch of fixed tuning instruments); and the detection of peaks due to harmonic series.<ref>Mitre, Adriano; Queiroz, Marcelo; Faria, Régis. [http://www.ime.usp.br/~mqz/Mitre_AESBR2006.pdf Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates.] Proceedings of the 4th AES Brazil Conference. 113-118, 2006.</ref>		Popular frequency domain algorithms include: the [[harmonic product spectrum]];<ref name="cnxpda">[http://cnx.org/content/m11714/latest/ Pitch Detection Algorithms], online resource from [[OpenStax CNX\|Connexions]]</ref><ref>A. Michael Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum and a Maximum Likelihood Estimate,” Proceedings of the Symposium on Computer Processing in Communications, Vol. XIX, Polytechnic Press: Brooklyn, New York, (1970), pp. 779–797.</ref> [[cepstrum\|cepstral]] analysis<ref>A. Michael Noll, “[https://asa.scitation.org/doi/abs/10.1121/1.1910339 Cepstrum Pitch Determination],” Journal of the Acoustical Society of America, Vol. 41, No. 2, (February 1967), pp. 293–309.</ref> and [[maximum likelihood]] which attempts to match the frequency domain characteristics to pre-defined frequency maps (useful for detecting pitch of fixed tuning instruments); and the detection of peaks due to harmonic series.<ref>Mitre, Adriano; Queiroz, Marcelo; Faria, Régis. [http://www.ime.usp.br/~mqz/Mitre_AESBR2006.pdf Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates.] Proceedings of the 4th AES Brazil Conference. 113-118, 2006.</ref>

2A00:23C4:AF34:4E01:694A:53C8:9935:6219: /* Spectral/temporal approaches */ YAAPT pitch tracking

2022-08-12T22:12:46Z

Spectral/temporal approaches: YAAPT pitch tracking

← Previous revision		Revision as of 22:12, 12 August 2022
Line 23:		Line 23:

	==Spectral/temporal approaches==		==Spectral/temporal approaches==
	Spectral/temporal pitch detection algorithms, e.g. the YAAPT pitch tracking,<ref>{{cite journal \| last=Zahorian \| first=Stephen A. \| last2=Hu \| first2=Hongbing \| title=A spectral/temporal method for robust fundamental frequency tracking \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=123 \| issue=6 \| year=2008 \| issn=0001-4966 \| doi=10.1121/1.2916590 \| pages=4559–4571\|url=http://bingweb.binghamton.edu/~hhu1/paper/Zahorian2008spectral.pdf}}</ref><ref>Stephen A. Zahorian and Hongbing Hu. [http://ws2.binghamton.edu/zahorian/yaapt.htm YAAPT Pitch Tracking MATLAB Function]</ref> are based upon a combination of time domain processing using an [[autocorrelation]] function such as normalized cross correlation, and frequency domain processing utilizing spectral information to identify the pitch. Then, among the candidates estimated from the two domains, a final pitch track can be computed using [[dynamic programming]]. The advantage of these approaches is that the tracking error in one domain can be reduced by the process in the other domain.		Spectral/temporal pitch detection algorithms, e.g. the [[YAAPT pitch tracking algorithm]],<ref>{{cite journal \| last=Zahorian \| first=Stephen A. \| last2=Hu \| first2=Hongbing \| title=A spectral/temporal method for robust fundamental frequency tracking \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=123 \| issue=6 \| year=2008 \| issn=0001-4966 \| doi=10.1121/1.2916590 \| pages=4559–4571\|url=http://bingweb.binghamton.edu/~hhu1/paper/Zahorian2008spectral.pdf}}</ref><ref>Stephen A. Zahorian and Hongbing Hu. [http://ws2.binghamton.edu/zahorian/yaapt.htm YAAPT Pitch Tracking MATLAB Function]</ref> are based upon a combination of time domain processing using an [[autocorrelation]] function such as normalized cross correlation, and frequency domain processing utilizing spectral information to identify the pitch. Then, among the candidates estimated from the two domains, a final pitch track can be computed using [[dynamic programming]]. The advantage of these approaches is that the tracking error in one domain can be reduced by the process in the other domain.

	==Speech pitch detection==		==Speech pitch detection==

Fgnievinski: /* See also */

2022-02-21T01:33:33Z

Rlink2: /* Frequency-domain approaches */ archive link repair, may include: archive.* -> archive.today, and http->https for ghostarchive.org and archive.org (wp:el#Specifying_protocols)

2021-12-03T04:54:04Z

Frequency-domain approaches: archive link repair, may include: archive.* -> archive.today, and http->https for ghostarchive.org and archive.org (wp:el#Specifying_protocols)

← Previous revision		Revision as of 04:54, 3 December 2021
Line 20:		Line 20:
	Popular frequency domain algorithms include: the [[harmonic product spectrum]];<ref name="cnxpda">[http://cnx.org/content/m11714/latest/ Pitch Detection Algorithms], online resource from [[OpenStax CNX\|Connexions]]</ref><ref>A. Michael Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum and a Maximum Likelihood Estimate,” Proceedings of the Symposium on Computer Processing in Communications, Vol. XIX, Polytechnic Press: Brooklyn, New York, (1970), pp. 779–797.</ref> [[cepstrum\|cepstral]] analysis<ref>A. Michael Noll, “[https://asa.scitation.org/doi/abs/10.1121/1.1910339 Cepstrum Pitch Determination],” Journal of the Acoustical Society of America, Vol. 41, No. 2, (February 1967), pp. 293–309.</ref> and [[maximum likelihood]] which attempts to match the frequency domain characteristics to pre-defined frequency maps (useful for detecting pitch of fixed tuning instruments); and the detection of peaks due to harmonic series.<ref>Mitre, Adriano; Queiroz, Marcelo; Faria, Régis. [http://www.ime.usp.br/~mqz/Mitre_AESBR2006.pdf Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates.] Proceedings of the 4th AES Brazil Conference. 113-118, 2006.</ref>		Popular frequency domain algorithms include: the [[harmonic product spectrum]];<ref name="cnxpda">[http://cnx.org/content/m11714/latest/ Pitch Detection Algorithms], online resource from [[OpenStax CNX\|Connexions]]</ref><ref>A. Michael Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum and a Maximum Likelihood Estimate,” Proceedings of the Symposium on Computer Processing in Communications, Vol. XIX, Polytechnic Press: Brooklyn, New York, (1970), pp. 779–797.</ref> [[cepstrum\|cepstral]] analysis<ref>A. Michael Noll, “[https://asa.scitation.org/doi/abs/10.1121/1.1910339 Cepstrum Pitch Determination],” Journal of the Acoustical Society of America, Vol. 41, No. 2, (February 1967), pp. 293–309.</ref> and [[maximum likelihood]] which attempts to match the frequency domain characteristics to pre-defined frequency maps (useful for detecting pitch of fixed tuning instruments); and the detection of peaks due to harmonic series.<ref>Mitre, Adriano; Queiroz, Marcelo; Faria, Régis. [http://www.ime.usp.br/~mqz/Mitre_AESBR2006.pdf Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates.] Proceedings of the 4th AES Brazil Conference. 113-118, 2006.</ref>

	To improve on the pitch estimate derived from the discrete Fourier spectrum, techniques such as [[Reassignment method\|spectral reassignment]] (phase based) or [[Grandke interpolation]] (magnitude based) can be used to go beyond the precision provided by the FFT bins. Another phase-based approach is offered by Brown and Puckette <ref>Brown JC and Puckette MS (1993). A high resolution fundamental frequency determination based on phase changes of the Fourier transform. J. Acoust. Soc. Am. Volume 94, Issue 2, pp. 662–667 [https://archive.is/20130414073448/http://asadl.org/jasa/resource/1/jasman/v94/i2/p662_s1?isAuthorized=no ]</ref>		To improve on the pitch estimate derived from the discrete Fourier spectrum, techniques such as [[Reassignment method\|spectral reassignment]] (phase based) or [[Grandke interpolation]] (magnitude based) can be used to go beyond the precision provided by the FFT bins. Another phase-based approach is offered by Brown and Puckette <ref>Brown JC and Puckette MS (1993). A high resolution fundamental frequency determination based on phase changes of the Fourier transform. J. Acoust. Soc. Am. Volume 94, Issue 2, pp. 662–667 [https://archive.today/20130414073448/http://asadl.org/jasa/resource/1/jasman/v94/i2/p662_s1?isAuthorized=no ]</ref>

	==Spectral/temporal approaches==		==Spectral/temporal approaches==

Forbes72: improve some existing references

2021-11-11T04:18:06Z

improve some existing references

← Previous revision		Revision as of 04:18, 11 November 2021
Line 12:		Line 12:
	More sophisticated approaches compare segments of the signal with other segments offset by a trial period to find a match. AMDF ([[average magnitude difference function]]), ASMDF (Average Squared Mean Difference Function), and other similar [[autocorrelation]] algorithms work this way. These algorithms can give quite accurate results for highly periodic signals. However, they have false detection problems (often "''octave errors''"), can sometimes cope badly with noisy signals (depending on the implementation), and - in their basic implementations - do not deal well with [[polyphony\|polyphonic]] sounds (which involve multiple musical notes of different pitches).{{Cn\|date=October 2018}}		More sophisticated approaches compare segments of the signal with other segments offset by a trial period to find a match. AMDF ([[average magnitude difference function]]), ASMDF (Average Squared Mean Difference Function), and other similar [[autocorrelation]] algorithms work this way. These algorithms can give quite accurate results for highly periodic signals. However, they have false detection problems (often "''octave errors''"), can sometimes cope badly with noisy signals (depending on the implementation), and - in their basic implementations - do not deal well with [[polyphony\|polyphonic]] sounds (which involve multiple musical notes of different pitches).{{Cn\|date=October 2018}}

	Current{{When\|date=October 2018}} time-domain pitch detector algorithms tend to build upon the basic methods mentioned above, with additional refinements to bring the performance more in line with a human assessment of pitch. For example, the YIN algorithm<ref>A. de Cheveigné ~~and~~ H. Kawahara. ~~[http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf~~ YIN, a fundamental frequency estimator for speech and music.] The Journal of the Acoustical Society of America, ~~Vol.~~ 111, ~~No.~~ 4, ~~April~~ 2002. ~~{{doi~~\|10.1121/1.1458024}}</ref> and the MPM algorithm<ref>P. McLeod and G. Wyvill. [http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf A smarter way to find pitch.] In Proceedings of the International Computer Music Conference (ICMC’05), 2005.</ref> are both based upon [[autocorrelation]].		Current{{When\|date=October 2018}} time-domain pitch detector algorithms tend to build upon the basic methods mentioned above, with additional refinements to bring the performance more in line with a human assessment of pitch. For example, the YIN algorithm<ref>{{cite journal \| last=de Cheveigné \| first=Alain \| last2=Kawahara \| first2=Hideki \| title=YIN, a fundamental frequency estimator for speech and music \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=111 \| issue=4 \| year=2002 \| issn=0001-4966 \| doi=10.1121/1.1458024 \| pages=1917–1930\|url=http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf}}</ref> and the MPM algorithm<ref>P. McLeod and G. Wyvill. [http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf A smarter way to find pitch.] In Proceedings of the International Computer Music Conference (ICMC’05), 2005.</ref> are both based upon [[autocorrelation]].

	==Frequency-domain approaches==		==Frequency-domain approaches==
Line 23:		Line 23:

	==Spectral/temporal approaches==		==Spectral/temporal approaches==
	Spectral/temporal pitch detection algorithms, e.g. the YAAPT pitch tracking,<ref>Stephen A. ~~Zahorian~~ ~~and~~ Hongbing ~~Hu. [http://bingweb.binghamton.edu/~hhu1/paper/Zahorian2008spectral.pdf~~ A ~~Spectral~~/temporal method for ~~Robust~~ ~~Fundamental~~ ~~Frequency~~ ~~Tracking.]~~ The Journal of the Acoustical Society of America, ~~123~~ (6), 2008. ~~{{doi~~\|10.1121/1.2916590}}</ref><ref>Stephen A. Zahorian and Hongbing Hu. [http://ws2.binghamton.edu/zahorian/yaapt.htm YAAPT Pitch Tracking MATLAB Function]</ref> are based upon a combination of time domain processing using an [[autocorrelation]] function such as normalized cross correlation, and frequency domain processing utilizing spectral information to identify the pitch. Then, among the candidates estimated from the two domains, a final pitch track can be computed using [[dynamic programming]]. The advantage of these approaches is that the tracking error in one domain can be reduced by the process in the other domain.		Spectral/temporal pitch detection algorithms, e.g. the YAAPT pitch tracking,<ref>{{cite journal \| last=Zahorian \| first=Stephen A. \| last2=Hu \| first2=Hongbing \| title=A spectral/temporal method for robust fundamental frequency tracking \| journal=The Journal of the Acoustical Society of America \| publisher=Acoustical Society of America (ASA) \| volume=123 \| issue=6 \| year=2008 \| issn=0001-4966 \| doi=10.1121/1.2916590 \| pages=4559–4571\|url=http://bingweb.binghamton.edu/~hhu1/paper/Zahorian2008spectral.pdf}}</ref><ref>Stephen A. Zahorian and Hongbing Hu. [http://ws2.binghamton.edu/zahorian/yaapt.htm YAAPT Pitch Tracking MATLAB Function]</ref> are based upon a combination of time domain processing using an [[autocorrelation]] function such as normalized cross correlation, and frequency domain processing utilizing spectral information to identify the pitch. Then, among the candidates estimated from the two domains, a final pitch track can be computed using [[dynamic programming]]. The advantage of these approaches is that the tracking error in one domain can be reduced by the process in the other domain.

	==Speech pitch detection==		==Speech pitch detection==

Dmoore5556: /* top */ add short description and hatnote

2021-08-08T19:07:49Z

top: add short description and hatnote

← Previous revision		Revision as of 19:07, 8 August 2021
Line 1:		Line 1:
			{{short description\|Algorithm to estimate signal frequency}}
			{{redirect\|Pitch tracking\|the baseball term\|Glossary of baseball (P)#pitch tracking}}
	A '''pitch detection algorithm''' ('''PDA''') is an [[algorithm]] designed to estimate the [[pitch (music)\|pitch]] or [[fundamental frequency]] of a [[quasiperiodic]] or [[oscillation\|oscillating]] signal, usually a [[digital recording]] of [[speech processing\|speech]] or a musical note or tone. This can be done in the [[time domain]], the [[frequency domain]], or both.		A '''pitch detection algorithm''' ('''PDA''') is an [[algorithm]] designed to estimate the [[pitch (music)\|pitch]] or [[fundamental frequency]] of a [[quasiperiodic]] or [[oscillation\|oscillating]] signal, usually a [[digital recording]] of [[speech processing\|speech]] or a musical note or tone. This can be done in the [[time domain]], the [[frequency domain]], or both.

Xyzäöå: "–"

2021-01-10T14:42:34Z

"–"

← Previous revision		Revision as of 14:42, 10 January 2021
Line 16:		Line 16:
	. This requires more processing power as the desired accuracy increases, although the well-known efficiency of the [[Fast Fourier transform\|FFT]], a key part of the [[periodogram]] algorithm, makes it suitably efficient for many purposes.		. This requires more processing power as the desired accuracy increases, although the well-known efficiency of the [[Fast Fourier transform\|FFT]], a key part of the [[periodogram]] algorithm, makes it suitably efficient for many purposes.

	Popular frequency domain algorithms include: the [[harmonic product spectrum]];<ref name="cnxpda">[http://cnx.org/content/m11714/latest/ Pitch Detection Algorithms], online resource from [[OpenStax CNX\|Connexions]]</ref><ref>A. Michael Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum and a Maximum Likelihood Estimate,” Proceedings of the Symposium on Computer Processing in Communications, Vol. XIX, Polytechnic Press: Brooklyn, New York, (1970), pp. ~~779-797~~.</ref> [[cepstrum\|cepstral]] analysis<ref>A. Michael Noll, “[https://asa.scitation.org/doi/abs/10.1121/1.1910339 Cepstrum Pitch Determination],” Journal of the Acoustical Society of America, Vol. 41, No. 2, (February 1967), pp. ~~293-309~~.</ref> and [[maximum likelihood]] which attempts to match the frequency domain characteristics to pre-defined frequency maps (useful for detecting pitch of fixed tuning instruments); and the detection of peaks due to harmonic series.<ref>Mitre, Adriano; Queiroz, Marcelo; Faria, Régis. [http://www.ime.usp.br/~mqz/Mitre_AESBR2006.pdf Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates.] Proceedings of the 4th AES Brazil Conference. 113-118, 2006.</ref>		Popular frequency domain algorithms include: the [[harmonic product spectrum]];<ref name="cnxpda">[http://cnx.org/content/m11714/latest/ Pitch Detection Algorithms], online resource from [[OpenStax CNX\|Connexions]]</ref><ref>A. Michael Noll, “Pitch Determination of Human Speech by the Harmonic Product Spectrum, the Harmonic Sum Spectrum and a Maximum Likelihood Estimate,” Proceedings of the Symposium on Computer Processing in Communications, Vol. XIX, Polytechnic Press: Brooklyn, New York, (1970), pp. 779–797.</ref> [[cepstrum\|cepstral]] analysis<ref>A. Michael Noll, “[https://asa.scitation.org/doi/abs/10.1121/1.1910339 Cepstrum Pitch Determination],” Journal of the Acoustical Society of America, Vol. 41, No. 2, (February 1967), pp. 293–309.</ref> and [[maximum likelihood]] which attempts to match the frequency domain characteristics to pre-defined frequency maps (useful for detecting pitch of fixed tuning instruments); and the detection of peaks due to harmonic series.<ref>Mitre, Adriano; Queiroz, Marcelo; Faria, Régis. [http://www.ime.usp.br/~mqz/Mitre_AESBR2006.pdf Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates.] Proceedings of the 4th AES Brazil Conference. 113-118, 2006.</ref>

	To improve on the pitch estimate derived from the discrete Fourier spectrum, techniques such as [[Reassignment method\|spectral reassignment]] (phase based) or [[Grandke interpolation]] (magnitude based) can be used to go beyond the precision provided by the FFT bins. Another phase-based approach is offered by Brown and Puckette <ref>Brown JC and Puckette MS (1993). A high resolution fundamental frequency determination based on phase changes of the Fourier transform. J. Acoust. Soc. Am. Volume 94, Issue 2, pp. ~~662-667~~ [https://archive.is/20130414073448/http://asadl.org/jasa/resource/1/jasman/v94/i2/p662_s1?isAuthorized=no ]</ref>		To improve on the pitch estimate derived from the discrete Fourier spectrum, techniques such as [[Reassignment method\|spectral reassignment]] (phase based) or [[Grandke interpolation]] (magnitude based) can be used to go beyond the precision provided by the FFT bins. Another phase-based approach is offered by Brown and Puckette <ref>Brown JC and Puckette MS (1993). A high resolution fundamental frequency determination based on phase changes of the Fourier transform. J. Acoust. Soc. Am. Volume 94, Issue 2, pp. 662–667 [https://archive.is/20130414073448/http://asadl.org/jasa/resource/1/jasman/v94/i2/p662_s1?isAuthorized=no ]</ref>

	==Spectral/temporal approaches==		==Spectral/temporal approaches==

James Flatbottom III: /* General approaches */ Updated the YIN paper link.

2020-09-24T12:26:50Z

General approaches: Updated the YIN paper link.

← Previous revision		Revision as of 12:26, 24 September 2020
Line 10:		Line 10:
	More sophisticated approaches compare segments of the signal with other segments offset by a trial period to find a match. AMDF ([[average magnitude difference function]]), ASMDF (Average Squared Mean Difference Function), and other similar [[autocorrelation]] algorithms work this way. These algorithms can give quite accurate results for highly periodic signals. However, they have false detection problems (often "''octave errors''"), can sometimes cope badly with noisy signals (depending on the implementation), and - in their basic implementations - do not deal well with [[polyphony\|polyphonic]] sounds (which involve multiple musical notes of different pitches).{{Cn\|date=October 2018}}		More sophisticated approaches compare segments of the signal with other segments offset by a trial period to find a match. AMDF ([[average magnitude difference function]]), ASMDF (Average Squared Mean Difference Function), and other similar [[autocorrelation]] algorithms work this way. These algorithms can give quite accurate results for highly periodic signals. However, they have false detection problems (often "''octave errors''"), can sometimes cope badly with noisy signals (depending on the implementation), and - in their basic implementations - do not deal well with [[polyphony\|polyphonic]] sounds (which involve multiple musical notes of different pitches).{{Cn\|date=October 2018}}

	Current{{When\|date=October 2018}} time-domain pitch detector algorithms tend to build upon the basic methods mentioned above, with additional refinements to bring the performance more in line with a human assessment of pitch. For example, the YIN algorithm<ref>A. de Cheveigné and H. Kawahara. [http://~~www~~.~~ircam~~.fr/~~pcm~~/~~cheveign/pss~~/2002_JASA_YIN.pdf YIN, a fundamental frequency estimator for speech and music.]~~{{dead link\|date=March 2018 \|bot=InternetArchiveBot \|fix-attempted=yes }}~~ The Journal of the Acoustical Society of America, 111~~:1917~~, 2002. {{doi\|10.1121/1.1458024}}</ref> and the MPM algorithm<ref>P. McLeod and G. Wyvill. [http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf A smarter way to find pitch.] In Proceedings of the International Computer Music Conference (ICMC’05), 2005.</ref> are both based upon [[autocorrelation]].		Current{{When\|date=October 2018}} time-domain pitch detector algorithms tend to build upon the basic methods mentioned above, with additional refinements to bring the performance more in line with a human assessment of pitch. For example, the YIN algorithm<ref>A. de Cheveigné and H. Kawahara. [http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf YIN, a fundamental frequency estimator for speech and music.] The Journal of the Acoustical Society of America, Vol. 111, No. 4, April 2002. {{doi\|10.1121/1.1458024}}</ref> and the MPM algorithm<ref>P. McLeod and G. Wyvill. [http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf A smarter way to find pitch.] In Proceedings of the International Computer Music Conference (ICMC’05), 2005.</ref> are both based upon [[autocorrelation]].

	==Frequency-domain approaches==		==Frequency-domain approaches==

← Previous revision		Revision as of 01:33, 21 February 2022
Line 36:		Line 36:
	* [[Linear predictive coding]]		* [[Linear predictive coding]]
	* [[MUSIC (algorithm)]]		* [[MUSIC (algorithm)]]
			* [[Sinusoidal model]]

	==References==		==References==