<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Eoin Farrell</title><description>Astrophysics PhD &amp; AI Researcher</description><link>https://efarrell1.github.io/</link><item><title>Impact of binary interaction on the evolution of blue supergiants</title><link>https://efarrell1.github.io/blog/bsg_binaries/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/bsg_binaries/</guid><description>Published in Farrell et al. (2019), Astronomy &amp; Astrophysics, Volume 621</description><pubDate>Tue, 01 Jan 2019 00:00:00 GMT</pubDate><category>astrophysics</category></item><item><title>Numerical experiments to help understand cause and effect in massive star evolution</title><link>https://efarrell1.github.io/blog/cause_and_effect_massive_stars/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/cause_and_effect_massive_stars/</guid><description>Published in Farrell et al. (2022), Monthly Notices of the Royal Astronomical Society, Volume 512, Issue 3</description><pubDate>Sun, 01 May 2022 00:00:00 GMT</pubDate><category>astrophysics</category></item><item><title>Induction head circuits for longer sequences</title><link>https://efarrell1.github.io/blog/generalised_induction/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/generalised_induction/</guid><description>In a 2-layer attention-only transformer model, an induction head can combine with an &quot;averaging&quot; head that stores some kind of average over the previous ~4-5 tokens to produce a circuit that can predict the next token in repeated sequences of length 2 to 5.</description><pubDate>Tue, 03 Oct 2023 00:00:00 GMT</pubDate><category>ai</category></item><item><title>Is GW190521 the merger of black holes from the first stellar generations?</title><link>https://efarrell1.github.io/blog/gw190521/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/gw190521/</guid><description>Published in Farrell et al. 
(2021), Monthly Notices of the Royal Astronomical Society: Letters, Volume 502, Issue 1</description><pubDate>Mon, 01 Mar 2021 00:00:00 GMT</pubDate><category>astrophysics</category></item><item><title>The Initial Magnetic Field Distribution in AB Stars</title><link>https://efarrell1.github.io/blog/ifd_ab_stars/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/ifd_ab_stars/</guid><description>Published in Farrell et al. (2022), The Astrophysical Journal, Volume 938, Issue 1</description><pubDate>Sat, 01 Oct 2022 00:00:00 GMT</pubDate><category>astrophysics</category></item><item><title>Experiments with an alternative method to promote sparsity in sparse autoencoders</title><link>https://efarrell1.github.io/blog/l0_loss_function/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/l0_loss_function/</guid><description>I experimented with alternatives to the standard L1 penalty used to promote sparsity in sparse autoencoders (SAEs). I found that including terms based on an alternative differentiable approximation of the feature sparsity in the loss function was an effective way to generate sparsity in SAEs trained on the residual stream of GPT2-small.</description><pubDate>Tue, 16 Apr 2024 00:00:00 GMT</pubDate><category>ai</category></item><item><title>A method for non-linear inversion of the stellar structure applied to gravity-mode pulsators</title><link>https://efarrell1.github.io/blog/non_linear_stellar_inversions/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/non_linear_stellar_inversions/</guid><description>Published in Farrell et al. 
(2024), Astronomy &amp; Astrophysics, Volume 686</description><pubDate>Thu, 18 Apr 2024 00:00:00 GMT</pubDate><category>astrophysics</category></item><item><title>Positional Embeddings in a 2-layer attention-only transformer model</title><link>https://efarrell1.github.io/blog/pos-embeddings-two_layer/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/pos-embeddings-two_layer/</guid><description>This post contains some visualisations and discussion of positional embeddings. The position embeddings in a 2-layer attention-only transformer model arrange themselves into a helical structure. This presumably allows the model to generate QK matrices to move a few positions in relative terms with a similar transformation for all positions. The positional embeddings at positions 0 and 1023 have special properties.</description><pubDate>Sun, 01 Oct 2023 00:00:00 GMT</pubDate><category>ai</category></item><item><title>The previous token head and the &quot;look-back-two&quot; head</title><link>https://efarrell1.github.io/blog/previous-token-head/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/previous-token-head/</guid><description>A few plots on previous token heads, a discussion of how they work, and a comparison to a similar type of attention head -- a &quot;look-back-two&quot; head.</description><pubDate>Mon, 02 Oct 2023 00:00:00 GMT</pubDate><category>ai</category></item><item><title>&apos;Recency bias&apos; in an induction head</title><link>https://efarrell1.github.io/blog/recency-bias-induction-head/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/recency-bias-induction-head/</guid><description>The induction head in a 2-layer attention-only transformer model has a slight bias towards tokens later in the context rather than earlier ones. 
Interestingly, its notion of position appears to not depend on positional embeddings, or any specific output from an attention head in the previous layer.</description><pubDate>Fri, 06 Oct 2023 00:00:00 GMT</pubDate><category>ai</category></item><item><title>The uncertain masses of progenitors of core-collapse supernovae and direct-collapse black holes</title><link>https://efarrell1.github.io/blog/rsg_masses/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/rsg_masses/</guid><description>Published in Farrell et al. (2020), Monthly Notices of the Royal Astronomical Society, Volume 494, Issue 1</description><pubDate>Fri, 01 May 2020 00:00:00 GMT</pubDate><category>astrophysics</category></item><item><title>Experiments with Sparse Autoencoders on Attention Heads</title><link>https://efarrell1.github.io/blog/sae_attention_heads/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/sae_attention_heads/</guid><description>I trained sparse autoencoders on the key and query vectors of previous token heads and induction heads of attn-only-2l and gpt2-small and found interpretable features which I could intervene on in a predictable and interpretable way.</description><pubDate>Thu, 21 Dec 2023 00:00:00 GMT</pubDate><category>ai</category></item><item><title>SNAPSHOT: connections between internal and surface properties of massive stars</title><link>https://efarrell1.github.io/blog/snapshot/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/snapshot/</guid><description>Published in Farrell et al. 
(2020), Monthly Notices of the Royal Astronomical Society, Volume 495, Issue 4</description><pubDate>Wed, 01 Jul 2020 00:00:00 GMT</pubDate><category>astrophysics</category></item><item><title>Unlearning with sparse autoencoders</title><link>https://efarrell1.github.io/blog/unlearning_harry_potter/</link><guid isPermaLink="true">https://efarrell1.github.io/blog/unlearning_harry_potter/</guid><description>We trained sparse autoencoders on the open-source language model Pythia-2.8b to use for unlearning Harry Potter related knowledge. We can successfully unlearn significant levels of Harry Potter related knowledge with little to no side effects. This technique is worth exploring further.</description><pubDate>Mon, 03 Jun 2024 00:00:00 GMT</pubDate><category>ai</category></item></channel></rss>