❤️ 4 ∀

Sharing ideas I love with the world. I will use this blog to cover ideas such as intelligence, causality, agency, and consciousness (awareness and self-models), in the hope of creating greater meaning through discussions with you. These ideas are important to discuss for safety reasons. AI today feels like driving a fast Ferrari at night without lights. My hope is to shed some light so that we can anticipate and avert risks. We must not be afraid to research these topics: the research is needed to prevent the accidental, uncontrolled creation of self-aware machines. I would also like to focus on positive strategies, such as using AI to design materials (as cusp.ai is championing), proteins, drugs, and nano-machines, both to prevent the catastrophic destruction of our environment and to create real wealth for everyone, e.g. via carbon capture or via energy generation and storage. Much of this will also be about the extended-mind approach of Clark, Chalmers and Dennett: building AI that harnesses all existing tools, especially scientific and engineering tools, to advance science, mathematics, health, and technology beyond what we humans are capable of.

Browse entries

Latest entries

Continual, interactive, causal agents

Modern language-model agents are usually built by stacking separate training regimes: pretraining, midtraining, supervised fine-tuning, preference modeling, rejection sampling, reinforcement learning, reasoning-specific tu…

Read PDF Download notebook Download TeX source

Emergent reward maximization

Can an interactional imitation learner, trained without scalar reward labels, recover behavior that is equivalent to expected reward maximization purely from world-written preference evidence? The answer as shown here is…

Read PDF Download notebook Download TeX source

Why it is important to understand causality and agency

Large language models are increasingly deployed as agents: they call tools, follow instructions, and act on behalf of users in multi-turn loops. Yet self-improvement and industrial flywheel fine-tuning recipes still treat…

Read PDF Download notebook Download TeX source

Intelligence via generation and selection: A tutorial on reinforcement learning with LLMs and tools

The central theme of this note is that intelligent systems become powerful when they can both generate candidate behaviours and select among them. Supervised learning corresponds to the most trivial form of imitation: m…

Read PDF Download notebook Download TeX source

Diffusion and flow matching tutorial

Diffusion and flow matching are the standard methods for generating images, video, speech, music, and even protein structures and molecular simulations. The application to science, in particular to molecular design, is fasc…

Read PDF Download notebook Download TeX source
