Related changes
List of abbreviations:
- D: Edit made at Wikidata
- r: Edit flagged by ORES
- N: New page
- m: Minor edit
- b: Bot edit
- (±123): Page byte size change
- Temporarily watched page
20 June 2025
- diffhist m Machine learning 00:51 0 Forever; wherever talk contribs (→physical neural networks: Capitalize sub-section title's first letter)
19 June 2025
- diffhist Mesa-optimization 23:47 −19 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:45 +4 Southernhemisphere talk contribs (→Concept and motivation)
- diffhist Mesa-optimization 23:27 +4 Southernhemisphere talk contribs (→Concept and motivation)
- diffhist Mesa-optimization 23:27 +8 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:26 −247 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:25 −9 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:25 +451 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:23 +384 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:21 +562 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:18 +1,396 Southernhemisphere talk contribs
- diffhist Mesa-optimization 23:17 +445 Southernhemisphere talk contribs
- diffhist N Mesa-optimization 23:15 +5,308 Southernhemisphere talk contribs (←Created page with ''''Mesa-optimization''' refers to a phenomenon in advanced machine learning where a model trained by an outer optimizer—such as stochastic gradient descent—develops into an optimizer itself, known as a ''mesa-optimizer''. Rather than merely executing learned patterns of behavior, the system actively optimizes for its own internal goals, which may not align with those intended by human designers. This raises significant concerns in the field of AI alig...')
- diffhist Goodhart's law 18:08 +29 DMacks talk contribs (→Priority and background: try to clarify some things)
- diffhist Machine learning 15:26 −405 MrOllie talk contribs (Reverted 1 edit by Manolochama (talk): Rv refspammer) Tags: Twinkle Undo
- diffhist m Machine learning 15:23 +405 Manolochama talk contribs (Quote about business application) Tag: Reverted
- Move log 08:26 Anohthterwikipedian talk contribs moved page Outer alignment (artificial intelligence) to Outer alignment over redirect (Perform technical move requested at WP:RM/TR (permalink): Requested by 89.243.41.89 at WP:RM/TR: disambiguation not needed)
- diffhist Outer alignment (artificial intelligence) 03:28 −2 Southernhemisphere talk contribs (→See also)
- diffhist Outer alignment (artificial intelligence) 03:27 +66 Southernhemisphere talk contribs (→See also)
18 June 2025
- diffhist Reward hacking 23:53 +40 Alenoach talk contribs (mentioning reinforcement learning in the first sentence) Tag: Visual edit
- diffhist Reward hacking 23:52 −36 Alenoach talk contribs (cleaning) Tag: Visual edit
- diffhist Outer alignment (artificial intelligence) 23:18 −249 Alenoach talk contribs (copyedit) Tag: Visual edit
- diffhist Outer alignment (artificial intelligence) 06:46 +75 Zero Contradictions talk contribs (Adding short description: "The challenge of training AI to reflect human values") Tag: Shortdesc helper
- diffhist Outer alignment (artificial intelligence) 02:51 −27 Southernhemisphere talk contribs (Fixed)
- diffhist Reward hacking 02:49 +64 Southernhemisphere talk contribs (→See also)
- diffhist AI alignment 02:46 +46 Southernhemisphere talk contribs
17 June 2025
- diffhist Outer alignment (artificial intelligence) 16:56 +8 Southernhemisphere talk contribs (→Broader context in AI alignment)
- diffhist Outer alignment (artificial intelligence) 16:56 +8 Southernhemisphere talk contribs (→Theoretical limits and decidable alignment)
- diffhist Outer alignment (artificial intelligence) 16:55 +8 Southernhemisphere talk contribs (→Systems-theoretic approaches)
- diffhist Outer alignment (artificial intelligence) 16:54 +8 Southernhemisphere talk contribs
- diffhist Outer alignment (artificial intelligence) 16:49 0 Southernhemisphere talk contribs
- diffhist Outer alignment (artificial intelligence) 16:47 −8 Southernhemisphere talk contribs
- diffhist N Outer alignment (artificial intelligence) 16:45 +9,947 Southernhemisphere talk contribs (←Created page with 'Outer alignment (artificial intelligence) Outer alignment is a concept in artificial intelligence (AI) safety that refers to the challenge of specifying training objectives for AI systems in a way that truly reflects human values and intentions. It is often described as the reward misspecification problem, as it concerns whether the goal provided during training actually captures what humans want the AI to accomplish.<ref>{{cite web |title=What is outer ali...')
- diffhist AI safety 12:54 0 2001:700:1500:c033:8000::f7 talk ("Verify credibility" rather than "unreliable source". There is no source for this statement.)
- diffhist AI safety 12:53 +23 2001:700:1500:c033:8000::f7 talk (Machine ethics has mostly existed independently from the AI Safety field. They have different goals and vastly different approaches. If this is false, then please provide a source.)
- diffhist AI alignment 12:50 +22 2001:700:1500:c033:8000::f7 talk (The source does not explicitly mention that AI alignment is a "subfield" of AI Safety. There are no other sources that say this. While the source mentioned here does imply that AI alignment goals are the same as AI Safety, they do not distinguish these terminologies. Whoever reverted this, please provide evidence that AI Alignment is indeed a subfield of AI Safety from an academic source.) Tag: Reverted
- diffhist AI alignment 04:12 −37 Alenoach talk contribs (The source does not adopt the term "AI alignment", but it describes the same concepts and says they are within AI safety, so I kept it) Tags: Manual revert Reverted Visual edit
16 June 2025
- diffhist Goodhart's law 16:14 +6 Johnjbarton talk contribs (→Examples: cn)
- diffhist Goodhart's law 16:13 −230 Johnjbarton talk contribs (→Examples: Delete WP:SELFPUB source "Originally published at www.roshanrevankar.com on September 14, 2014")
- diffhist Goodhart's law 16:13 −158 Johnjbarton talk contribs (→Examples: Delete source that does not seem to exist.)
- diffhist Goodhart's law 15:55 +121 Loodog talk contribs (→Examples: Medium doesn't qualify as a self-published source. Besides, they're hardly the first source to draw the connection to GL.)
- diffhist AI alignment 11:33 +37 2001:700:1500:c033:8000::f7 talk (Unreliable source. The reference does not mention AI alignment.) Tag: Reverted