The Quadrilingual Probe: How Aquamosh (1998) Falsifies the Distributional Hypothesis Across Five Embedding Architectures

Abstract: This research uses Aquamosh (1998), the quadrilingual debut album by Plastilina Mosh (Spanish, English, French, Japanese; produced by Tom Rothrock and Rob Schnapf, the team behind Beck’s Odelay), as an empirical falsification probe for distributional sentence embeddings. The album’s quadrilingual structure converts code-switching from an anecdotal concern into a quantitative experiment: every language transition is a guaranteed lexical discontinuity, allowing us to dissociate topical continuity from surface form. Core Finding (CONFIRMED): Five sentence-embedding architectures were probed: OpenAI text-embedding-3-large (3072-dim, decoder), Google LaBSE (768-dim, encoder, parallel-corpus), BAAI BGE-M3 (1024-dim), multilingual-E5-large (1024-dim), and paraphrase-multilingual-MPNet (768-dim). In all five, a language switch between consecutive lyric lines raises the probability of a “window break” (the embedding similarity falling below a calibrated coherence threshold) by a mean factor of 1.69× across models, ranging from 1.31× (E5) to 1.94× (OpenAI). Permutation tests against the null hypothesis H₀ that language switches and ruptures are independent reject with z = +6.54 (OpenAI) and z = +4.51 (LaBSE), both p < 10⁻⁴ over 10,000 simulations. Logistic regression with GEE clustered by track, controlling for line position and anchor/successor languages, yields OR = 3.99 [2.51, 6.36] for OpenAI (p < 0.001) and OR = 2.52 [1.39, 4.57] for LaBSE (p = 0.002). An LLM-as-judge comparison against GPT-4o-mini shows that the OpenAI embedding declares a “rupture” where a sophisticated reader sees continuity 3.18× more often at language switches than at same-language transitions (false-break rate 0.191 vs 0.060). ...

May 20, 2026 · Carlos Daniel Jiménez
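
As a rough illustration of the permutation test mentioned in the Aquamosh abstract above, the sketch below shuffles the break labels across transitions and compares the observed number of breaks at language switches with the shuffled distribution. The toy data, the function names (break_rate_ratio, permutation_z), and the choice of test statistic are all assumptions made for illustration; the abstract does not specify them, and this sketch ignores the clustering by track that the regression accounts for.

```python
import random

def break_rate_ratio(switch, broke):
    """Break rate at language switches divided by break rate elsewhere."""
    at_switch = [b for s, b in zip(switch, broke) if s]
    at_same = [b for s, b in zip(switch, broke) if not s]
    return (sum(at_switch) / len(at_switch)) / (sum(at_same) / len(at_same))

def permutation_z(switch, broke, n_sim=10_000, seed=0):
    """Permutation test of H0: breaks are independent of language switches.

    Statistic (an assumption): number of breaks that fall on a switch.
    Returns the z-score of the observed statistic against the shuffled
    distribution and a one-sided empirical p-value.
    """
    rng = random.Random(seed)
    observed = sum(b for s, b in zip(switch, broke) if s)
    shuffled = list(broke)
    null = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # break labels reassigned at random
        null.append(sum(b for s, b in zip(switch, shuffled) if s))
    mean = sum(null) / n_sim
    var = sum((x - mean) ** 2 for x in null) / (n_sim - 1)
    z = (observed - mean) / var ** 0.5
    p = (1 + sum(x >= observed for x in null)) / (n_sim + 1)
    return z, p

# Hypothetical toy data: one entry per pair of consecutive lyric lines.
# switch[i] = 1 if the language changes, broke[i] = 1 if similarity < threshold.
switch = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
broke = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print("relative gap:", break_rate_ratio(switch, broke))
    print("z, p:", permutation_z(switch, broke, n_sim=2_000))
```

Shuffling the break labels keeps the total number of breaks and switches fixed, so only their co-occurrence is tested.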

Attention Windows: A Novel Framework for Measuring Narrative Cognitive Load in Beatles vs Pink Floyd

Abstract: This research introduces Attention Windows, a novel framework for measuring the cognitive span required by listeners to follow lyrical narratives. How long can a theme persist before the lyrics shift to something new? Building on previous semantic embedding analyses of the Beatles and Pink Floyd, we develop a multi-method approach to quantify this narrative architecture across two iconic albums: The Dark Side of the Moon and Abbey Road. Core Finding (UNEXPECTED): The analysis reveals a systematic failure of distributional semantics to capture abstract thematic coherence in progressive rock. The Beatles exhibit significantly longer attention windows (μ = 0.57 lines, SD = 1.48) than Pink Floyd (μ = 0.25 lines, SD = 0.97) when measured with OpenAI’s text-embedding-ada-002 at its calibrated threshold (θ = 0.85). This counterintuitive result (p < 0.01, Cohen’s d = -0.24) exposes a fundamental limitation: transformer-based embeddings, trained on distributional statistics from web corpora, systematically privilege type-level lexical overlap (repeated tokens, n-grams) over token-level conceptual continuity (abstract themes expressed through synonymy, metaphor, and semantic field variation). The Beatles’ verse-chorus architecture creates high embedding similarity through verbatim repetition, while Pink Floyd’s through-composed approach, which deploys varied metaphorical expressions of unified philosophical themes, produces orthogonal embedding vectors despite conceptual unity. This is not a quirk of ada-002 but a structural property of distributional semantics: co-occurrence statistics cannot distinguish “same theme, different words” from “different themes, same words.” ...

February 10, 2026 · Carlos Daniel Jiménez
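
One possible reading of the attention-window measurement in the abstract above can be sketched as follows. The definition used here is an assumption, since the abstract does not spell it out: a line's window is the number of consecutive following lines whose cosine similarity to that line stays at or above the threshold θ. The function names and the two-dimensional toy vectors are hypothetical stand-ins for real lyric embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def attention_windows(embeddings, theta=0.85):
    """Window length per anchor line (assumed definition).

    For each anchor, count consecutive following lines whose similarity
    to the anchor is >= theta; stop at the first line below theta.
    """
    windows = []
    for i, anchor in enumerate(embeddings):
        length = 0
        for follower in embeddings[i + 1:]:
            if cosine(anchor, follower) < theta:
                break
            length += 1
        windows.append(length)
    return windows

# Hypothetical toy "embeddings": three lines on one theme, then two on another.
toy_lines = [(1.0, 0.0), (0.9, 0.1), (1.0, 0.05), (0.0, 1.0), (0.1, 1.0)]

if __name__ == "__main__":
    lengths = attention_windows(toy_lines, theta=0.85)
    print("window lengths:", lengths)
    print("mean window:", sum(lengths) / len(lengths))
```

Under this definition a window of zero is common, which is consistent with mean window lengths below one line.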

Literary Mapping of Christmas Novels: A Vector Narrative Arc Approach

Post Objective: (1) data cleaning and preliminary analysis; (2) understanding the emotional charge or plot development of texts through semantic archaeology based on PCA; (3) understanding the connections and most representative ideas within the document set. Intention: Understanding a story’s behavior at the level of its variance is a challenge addressed by attentional engineering. Therefore, using lesser-known methods such as the vector narrative arc combined with a literary map constitutes an interesting route to address increasingly common problems. ...

January 7, 2026 · Carlos Daniel Jiménez

MLflow for Generative AI Systems

I’ll start this post by recalling what Huyen said in her book Designing Machine Learning Systems (2022): ‘Systems are meant to learn’. This statement reflects a simple fact: today, LLMs and, to a lesser extent, vision language models are winning in the Data Science world. But how do we measure this learning? RLHF work is always a good indicator that perplexity will improve, but let’s return to a key point: LLMs must work as a system, so debugging is important, and that’s where the necessary tool for every Data Scientist, AI Engineer, ML Engineer, and MLOps Engineer comes in: MLflow. ...

October 8, 2025 · Carlos Daniel Jiménez

Raspberry Pi 16GB, Servers, and MLOps

Less than two months ago, the most powerful version of the Raspberry Pi 5 hit the market, featuring 16GB of RAM. While its price ($120 USD) is a valid discussion point, as someone who uses these devices as servers for deployment testing and code-level efficiency evaluation, I want to explore its utility from a computer science perspective in the context of MLOps and LLM testing. Raspberry Pi Utility: Let’s start with some common applications to build on ideas: ...

March 10, 2025 · Carlos Daniel Jiménez
