Anatomy of an MLOps Pipeline - Part 1: Pipeline and Orchestration

Why This Post Is Not Another Scikit-Learn Tutorial
Most MLOps posts teach you how to train a Random Forest in a notebook and tell you “now put it in production.” This post assumes you already know how to train models. What you probably don’t know is how to build a system where: ...

January 13, 2026 · Carlos Daniel Jiménez

Anatomy of an MLOps Pipeline - Part 2: Deployment and Infrastructure

8. CI/CD with GitHub Actions: The Philosophy of Automated MLOps
The Philosophical Foundation: Why Automation Isn’t Optional
Before diving into YAML, let’s address the fundamental question: why do we automate ML pipelines? The naive answer is “to save time.” The real answer is more profound: human memory is unreliable, manual processes don’t scale, and production systems demand reproducibility. ...

January 13, 2026 · Carlos Daniel Jiménez

Anatomy of an MLOps Pipeline - Part 3: Production and Best Practices

11. Model and Parameter Selection Strategies
The Complete Flow: Selection → Sweep → Registration
This pipeline implements a three-phase strategy for model optimization, each phase with a specific purpose:

Step 05: Model Selection
├── Compares 5 algorithms with a basic GridSearch (5-10 combos per model)
├── Objective: Identify the best model family (Random Forest vs Gradient Boosting vs ...)
├── Primary metric: MAPE (Mean Absolute Percentage Error)
└── Output: Best algorithm + initial parameters

Step 06: Hyperparameter Sweep
├── Optimizes ONLY the best algorithm from Step 05
├── Bayesian optimization with 50+ runs (exhaustive search space)
├── Objective: Find the optimal configuration of the best model
├── Primary metric: wMAPE (Weighted MAPE, less biased)
└── Output: best_params.yaml with the optimal hyperparameters

Step 07: Model Registration
├── Trains the final model with the parameters from Step 06
├── Registers it in the MLflow Model Registry with rich metadata
├── Transitions it to a stage (Staging/Production)
└── Output: Versioned model ready for deployment

Why three separate steps? You don’t have the computational budget for an exhaustive sweep of 5 algorithms × 50 combinations = 250 training runs. First decide the strategy (which algorithm), then the tactics (which hyperparameters). ...
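To make the Step 06 → Step 07 hand-off concrete, here is a minimal sketch (not the series’ actual code) that reads the tuned best_params.yaml, refits the winning model, and registers it in the MLflow Model Registry. The GradientBoostingRegressor, the wMAPE helper, and the "demand_forecaster" registry name are illustrative assumptions.

```python
# Hypothetical sketch of the Step 06 -> Step 07 hand-off:
# read the tuned hyperparameters, fit the final model, register it in MLflow.
import yaml
import numpy as np
import mlflow
import mlflow.sklearn
from sklearn.ensemble import GradientBoostingRegressor  # stand-in for the winning algorithm


def wmape(y_true, y_pred) -> float:
    """Weighted MAPE: total absolute error divided by total actual volume."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()


def register_final_model(X_train, y_train, X_val, y_val,
                         params_path: str = "best_params.yaml") -> None:
    # Step 06 output: optimal hyperparameters found by the Bayesian sweep.
    with open(params_path) as f:
        best_params = yaml.safe_load(f)

    with mlflow.start_run(run_name="07_model_registration"):
        model = GradientBoostingRegressor(**best_params).fit(X_train, y_train)

        # Log the tuned parameters and the selection metric next to the model.
        mlflow.log_params(best_params)
        mlflow.log_metric("wmape_val", wmape(y_val, model.predict(X_val)))

        # Register a new version in the MLflow Model Registry.
        mlflow.sklearn.log_model(
            model,
            artifact_path="model",
            registered_model_name="demand_forecaster",  # illustrative name
        )
```

Promoting the registered version to Staging or Production, as Step 07 describes, would then go through the MLflow client (or through registry aliases in newer MLflow releases).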

January 13, 2026 · Carlos Daniel Jiménez

Literary Mapping of Christmas Novels: A Vector Narrative Arc Approach

Post Objective
- Walk through the data cleaning and preliminary analysis process
- Understand the emotional charge and plot development of the texts through semantic archaeology based on PCA
- Understand the connections and the most representative ideas within the document set

Intention
Understanding a story’s behavior at the level of its variance is a challenge addressed by attentional engineering. Using lesser-known methods such as the vector narrative arc, combined with a literary map, is therefore an interesting route for addressing increasingly common problems. ...
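As a rough illustration of the vector narrative arc idea (not the post’s actual pipeline), one can chunk a novel, vectorize the chunks, and project them with PCA so that the ordered trajectory through component space reads as the story’s progression. TF-IDF as the embedding and the chunk size below are assumptions.

```python
# Minimal sketch of a "vector narrative arc": embed consecutive chunks of a novel
# and project them with PCA so the trajectory through component space can be read
# as the story's progression. TF-IDF stands in for whatever embedding is used.
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def narrative_arc(text: str, chunk_size: int = 200, n_components: int = 2):
    # Split the text into consecutive, equal-sized word chunks.
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_size])
              for i in range(0, len(words), chunk_size)]

    # Vectorize each chunk, then reduce to a low-dimensional "arc".
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(chunks)
    arc = PCA(n_components=n_components).fit_transform(tfidf.toarray())
    return arc  # shape (n_chunks, n_components); plot the rows in order
```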

January 7, 2026 · Carlos Daniel Jiménez

MLflow for Generative AI Systems

I’ll start this post by recalling what Chip Huyen wrote in her book Designing Machine Learning Systems (2022): ‘Systems are meant to learn’. This statement reflects a simple fact: today, LLMs, and to a lesser extent vision-language models, are winning in the Data Science world. But how do we measure this learning? RLHF work is always a good indicator that perplexity will improve, but let’s return to a key point: LLMs must work as a system, therefore debugging is important, and that’s where the necessary tool for every Data Scientist, AI Engineer, ML Engineer, and MLOps Engineer comes in: MLflow. ...
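As a toy illustration of that “LLMs as a system you can debug” point (not code from the post), each model call can be wrapped in an MLflow run that records the prompt, the response, and a latency metric; call_llm and the logged names below are placeholders.

```python
# Minimal sketch: treat every LLM call as part of a system you can debug by
# recording its inputs, outputs, and latency in MLflow.
import time
import mlflow


def call_llm(prompt: str) -> str:
    return "stub response"  # placeholder for the real model client


def answer(prompt: str) -> str:
    with mlflow.start_run(run_name="llm_call"):
        start = time.perf_counter()
        response = call_llm(prompt)
        latency = time.perf_counter() - start

        # Persist the full interaction so failures can be replayed later.
        mlflow.log_param("prompt_chars", len(prompt))
        mlflow.log_metric("latency_s", latency)
        mlflow.log_dict({"prompt": prompt, "response": response},
                        "interaction.json")
    return response
```

Recent MLflow releases also ship a dedicated tracing API for GenAI workloads, which is a more natural fit for span-level debugging than this manual logging.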

October 8, 2025 · Carlos Daniel Jiménez

📬 Did this help?

I write about MLOps, Edge AI, and making models work outside the lab. One email per month, max. No spam, no course pitches, just technical content.