The conversation around DeepSeek keeps gaining momentum, and there is a lot of noise to cut through. Among the more valuable resources is the first link below, which clearly explains the training process behind DeepSeek-R1. On a practical note, I appreciated Daniel van Strien's article on using DeepSeek to generate high-quality annotations for datasets that then train ModernBERT, a technique some of us have been applying with powerful proprietary models.

However, there's more to the AI world than DeepSeek. Don't overlook Qwen2.5-VL, the Qwen team's new vision-language model, which appears to be quite effective at extracting information from photos and videos. Meanwhile, my new local model of choice on Ollama is Mistral Small 3, which has shown promising speed and quality in my initial tests. If you're overwhelmed by the choice of models, Ethan Mollick offers a helpful guide to the proprietary options. And stay tuned: beyond DeepSeek, Qwen, and Mistral, Zuckerberg is hinting at the release of Llama 4. Lastly, don't miss Marc Andreessen's comments; he is saying out loud what many industry leaders hesitate to voice publicly.

Beyond AI, I delved into an article about Airbnb lobbying for regulations that favor its interests and, on a completely different note, a fascinating piece on rethinking love, which traces how views on love have evolved over time.

AI

  • The Illustrated DeepSeek-R1: Jay Alammar’s article, “The Illustrated DeepSeek-R1,” explores the training process behind DeepSeek-R1, a large language model (LLM) focused on math and reasoning tasks. Unlike traditional LLMs, DeepSeek-R1 emits long chains of reasoning tokens and incorporates reinforcement learning (RL) for enhanced problem-solving. The model goes through three main development stages: base-model pre-training, supervised fine-tuning (SFT), and preference tuning. A novel element, a reasoning-oriented RL model named R1-Zero, allows training data to be verified automatically, minimizing reliance on human-labeled data. Despite its prowess, R1-Zero suffered from issues like poor readability, which motivated the more general and user-friendly R1 model. This process ensures DeepSeek-R1 excels at both reasoning and general tasks while building on the standard Transformer architecture.
  • DeepSeek Mania Shakes AI Industry to Its Core: DeepSeek, a Chinese-developed AI model, has become the most popular app in the US Apple App Store, outperforming models from OpenAI and others while using older, cost-effective technology. This has led to a significant drop in Nvidia’s stock and concern in the US market about China’s potential AI dominance. DeepSeek’s success challenges the notion of American supremacy in AI and suggests new efficiencies in AI development. Its open-source, modifiable nature raises questions about censorship and data privacy, particularly regarding Chinese control. The development highlights a shift in AI from expensive, resource-intensive models to more efficient alternatives, leading to debate over the future direction of the AI industry.
  • Distilling DeepSeek Reasoning to ModernBERT Classifiers: Daniel van Strien’s article discusses using large language models (LLMs) like DeepSeek-R1 to generate labels for fine-tuning ModernBERT models on classification tasks that require reasoning. DeepSeek-R1 is known for its powerful reasoning capabilities, and its distilled versions retain substantial reasoning ability in smaller models. Since acquiring labeled data for training classifiers remains a challenge, generating synthetic labels is proposed as a solution. The article covers structured label generation and running LLMs with tools like LM Studio and OpenAI’s Python client, emphasizing practical steps for efficient model training and evaluation (a minimal sketch of this labeling workflow appears after this list).
  • Explainer: What’s R1 and Everything Else?: Tim Kellogg’s article “Explainer: What’s R1 & Everything Else?” discusses recent developments in AI, focusing on the emergence of R1 in comparison to models like o1 and o3. R1 is lauded as cost-effective and open-source, and for effectively validating OpenAI’s work. AI reasoning models, crucial for planning and decision-making, are distinct from agents, which require software to interact with the world. R1 adopts a simple reinforcement learning approach to reasoning, challenging more complex methods like DPO and MCTS and allowing for quick, affordable innovation. The piece also touches on international AI strategies, contrasting the USA’s heavy funding with China’s innovative engineering and Europe’s regulatory efforts.
  • Mistral Small 3: Mistral Small 3 is a competitive AI model comparable to larger models like Llama 3.3 70B and Qwen 32B, providing an open alternative to proprietary models such as GPT-4o mini. It matches Llama 3.3 70B’s performance while being more than three times faster on the same hardware. The model is licensed under Apache 2.0, marking a shift from previous restrictions and allowing free download, modification, and use. Enterprises needing specialized models can access additional commercial versions (see the Ollama sketch after this list for running it locally).
  • Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL!: The Qwen Team introduces Qwen2.5-VL, their latest flagship vision-language model, showcasing substantial enhancements over its predecessor, Qwen2-VL. Key features include advanced visual understanding, agentic capabilities, and the ability to comprehend long videos by locating specific events. It excels at structured data outputs and visual localization, demonstrates impressive performance in document and diagram comprehension, and is adaptable without task-specific fine-tuning. Its enhanced image and OCR recognition supports multilingual and multi-orientation text, while a unique document parsing format improves metadata extraction. Video comprehension has been refined with dynamic-FPS training and absolute time encoding for precise content understanding, and upgraded spatial and temporal processing lets the model work at native image resolutions without traditional normalization. Improvements to the visual encoder, including window attention and a refined ViT structure, enhance efficiency and reduce computational load (a basic usage sketch follows this list).
  • Which AI to Use Now: Ethan Mollick provides insights on choosing among the AI tools available today, noting rapid advancements and a stream of new model releases. He highlights Claude from Anthropic, Google’s Gemini, and OpenAI’s ChatGPT as the current top choices for those starting with AI; accessing the cutting-edge models through easy-to-use apps typically costs around $20/month. The evolution of AI includes multimodal capabilities such as speech and vision, enhancing real-time interaction. ChatGPT has the best multimodal Live Mode, Gemini offers powerful integrated search, and Claude excels at nuanced responses. Emerging reasoning models like DeepSeek add a scholarly dimension for complex queries, underscoring the importance of choosing based on specific needs, features, and costs.
  • Computer-Using Agent: OpenAI has introduced a research preview of Operator, an agent that performs web tasks, powered by the Computer-Using Agent (CUA). CUA combines GPT-4o’s vision with advanced reasoning, allowing it to interact with graphical user interfaces much as a human does. It processes raw pixel data for navigation and task completion, including handling errors and adapting to changes, and operates through perception, reasoning, and action, seeking user confirmation for sensitive tasks. Available to Pro users in the U.S., this preview aims to refine Operator by learning from user input, enhancing its capabilities and reliability in task execution.
  • We Just Gave Sight to Smolagents: Hugging Face’s article discusses the addition of vision support to smolagents, enabling the use of vision-language models in agent-based systems. This enhancement allows agents to interpret visual content on web pages, improving their autonomy in tasks like web browsing. Images can be supplied at the start of a task or dynamically during execution via callbacks, enabling responsive interactions (a minimal sketch follows this list). The integration leverages tools like Helium for browser automation, but challenges remain due to variability in vision-language model effectiveness.
  • Quoting Mark Zuckerberg on Llama 4: Simon Willison’s Weblog highlights Mark Zuckerberg’s comments on the progress of Llama 4, an advanced model in development. Llama 4 aims to surpass previous versions by being natively multimodal and possessing agentic capabilities. This omni-model approach will allow Llama 4 to be innovative and unlock new use cases, building on Llama 3’s goal to make open-source models competitive with closed models.
  • Generative Language Model: The Instituto de Ingeniería del Conocimiento has developed RigoChat, a series of Spanish generative language models designed to handle complex instructions and maintain coherent conversations. These models support various natural language generation tasks and can be adapted to specific domains by retraining them with targeted corpora. Drawing on expertise from the RigoBERTa language understanding model, RigoChat has been trained on extensive Spanish texts to master linguistic patterns and styles, achieving high performance in language tasks, comparable to GPT-4.
  • Nvidia’s Jensen Huang Predicts IT Will Morph Into AI HR – And the Crowd Goes Wild: During Nvidia’s CES 2025 keynote, Jensen Huang envisioned a future where IT departments transition into AI human resources, managing digital workforces of AI agents. This keynote, resembling a rock concert with an enthusiastic 12,000-strong audience, boosted Nvidia’s shares by over 3%. Huang suggested IT professionals will oversee hiring, training, and supervising AI agents, customized for each company, marking a shift in corporate dynamics that was met with excitement and optimism.
  • Perplexity Launches Sonar, an API for AI Search: Perplexity launched Sonar, an API for integrating its AI search tools into external applications. Sonar enhances AI’s factuality by sourcing information in real time from trusted internet sources, and it already powers Zoom’s AI assistant, providing real-time, cited answers without leaving the chat. Sonar offers a basic plan with flat, cost-effective pricing and a Pro version for more detailed inquiries. With Sonar, Perplexity seeks to expand its revenue streams amid industry-wide price cuts, following a funding round valuing the startup at $520 million.
  • How to Deploy and Fine-Tune DeepSeek Models on AWS: Hugging Face’s blog post outlines steps to deploy and fine-tune DeepSeek models, specifically the DeepSeek-R1 series, on AWS. The DeepSeek-R1 models, which include open-source dense models based on the Llama and Qwen architectures, can be deployed using Hugging Face Inference Endpoints, offering simple deployment with cost-efficient scaling on AWS infrastructure. The guide provides detailed instructions for setting up and configuring AWS services like SageMaker and EC2 for model deployment, covering prerequisites and the configurations needed for efficient infrastructure management (a compressed SageMaker sketch appears after this list).
  • Top AI Investor Says Goal Is to Crash Human Wages: Marc Andreessen, cofounder of Andreessen Horowitz, has stirred controversy by suggesting that AI should “crash” human wages to achieve an economic utopia with high productivity and negligible prices. His statement is an example of the stark economic logic some tech visionaries use, framing current economic hardships as steps toward a grand future. Critics highlight the lack of immediate, meaningful improvements to life and the insufficiency of responses like universal basic income, which Andreessen opposes. His views parallel those of other tech leaders, like Larry Ellison and Mira Murati, who downplay potential job losses and societal disruptions caused by advancing AI.
  • The Worm in the Machine: “The Worm in the Machine” by Mike James discusses the challenge of simulating biologically accurate neural networks, contrasting the complexity of living neurons with that of artificial ones. While artificial neural networks with trillions of connections exist, accurately replicating a biological system like Caenorhabditis elegans, which has only 302 neurons, remains a formidable task. Recent simulations replicated the worm’s behavior using a detailed model of its muscles and neural connections, but only 136 neurons were modeled, revealing the difficulty of achieving complete biological accuracy. Despite these limitations, the model effectively mimicked real behaviors such as zigzag movement, shedding light on both the potential and the constraints of detailed neural simulations.
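
A minimal sketch of the synthetic-labeling workflow from Daniel van Strien's piece: point the OpenAI Python client at a locally served DeepSeek-R1 distill (the article mentions LM Studio, whose OpenAI-compatible server defaults to port 1234) and ask for one label per text. This is not the article's exact code; the model identifier and label set are assumptions.

```python
# Sketch: generating synthetic labels with a locally served DeepSeek-R1
# distill, in the spirit of van Strien's article. Assumptions: LM Studio is
# serving its OpenAI-compatible API on the default port, and the model name
# below matches whatever model is loaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

LABELS = ["news", "opinion", "other"]  # hypothetical label set

def label_text(text: str) -> str:
    """Ask the reasoning model for a single label; keep only the final answer."""
    response = client.chat.completions.create(
        model="deepseek-r1-distill-qwen-7b",  # assumed local model identifier
        messages=[
            {"role": "user",
             "content": f"Classify the following text as one of {LABELS}. "
                        f"Reply with the label only.\n\n{text}"},
        ],
        temperature=0.0,
    )
    answer = response.choices[0].message.content.strip()
    # R1-style models may emit <think>...</think> reasoning before the answer;
    # keep only what follows the closing tag.
    if "</think>" in answer:
        answer = answer.split("</think>")[-1].strip()
    return answer

print(label_text("The central bank raised interest rates by 25 basis points."))
```

The resulting (text, label) pairs then become the training set for a small ModernBERT classifier, which is the distillation step the article's title refers to.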
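
Since I mention running Mistral Small 3 on Ollama, here is how I query it from Python through Ollama's OpenAI-compatible endpoint. A minimal sketch, assuming the model tag below matches what you pulled locally.

```python
# Sketch: chatting with a local Mistral Small 3 through Ollama's
# OpenAI-compatible endpoint (served on port 11434 by default).
# Assumption: the model was pulled first, e.g. `ollama pull mistral-small`.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="mistral-small",  # assumed local model tag
    messages=[{"role": "user",
               "content": "Summarize DeepSeek-R1 in two sentences."}],
)
print(response.choices[0].message.content)
```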
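
For Qwen2.5-VL, a basic extraction sketch following the pattern published on the model card. It assumes a recent transformers release that includes the Qwen2.5-VL classes, the qwen-vl-utils helper package, and a placeholder image path.

```python
# Sketch: asking Qwen2.5-VL to extract information from a photo.
# Assumptions: transformers is new enough to ship the Qwen2.5-VL classes,
# qwen-vl-utils is installed, and "photo.jpg" is a placeholder path.
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "photo.jpg"},
        {"type": "text", "text": "Extract any visible text and describe the scene."},
    ],
}]

# Build the prompt and the vision inputs, then run generation.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, videos=video_inputs,
                   padding=True, return_tensors="pt").to(model.device)

generated = model.generate(**inputs, max_new_tokens=256)
# Trim the prompt tokens so only the model's answer is decoded.
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```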
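
For the smolagents item, a minimal sketch of the "image at task start" path described in the post (the callback path for mid-run screenshots is more involved). It assumes a smolagents release with vision support, where CodeAgent.run accepts an images argument, and a vision-capable model served through the Hugging Face Inference API; the model choice and image path are placeholders.

```python
# Sketch: handing an image to a smolagents agent at the start of a task.
# Assumptions: a smolagents release with vision support (run() takes images)
# and a vision-language model reachable via the HF Inference API.
from PIL import Image
from smolagents import CodeAgent, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2-VL-72B-Instruct")  # assumed VLM choice
agent = CodeAgent(tools=[], model=model)

image = Image.open("screenshot.png")  # placeholder image path
result = agent.run(
    "Describe what is shown in the provided image.",
    images=[image],
)
print(result)
```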
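
Finally, for the AWS guide, a compressed sketch of one path it covers: deploying a DeepSeek-R1 distill to a SageMaker real-time endpoint via the Hugging Face text-generation container. The instance type and generation settings are assumptions; the post itself walks through fuller configurations, including Inference Endpoints and EC2.

```python
# Sketch: deploying a DeepSeek-R1 distill to a SageMaker endpoint using the
# Hugging Face LLM (TGI) container. Assumptions: an AWS account with a
# suitable execution role, and an instance type sized to the model.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # or pass an explicit IAM role ARN

model = HuggingFaceModel(
    role=role,
    image_uri=get_huggingface_llm_image_uri("huggingface"),  # TGI container
    env={
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
        "MAX_INPUT_TOKENS": "4096",   # assumed limits
        "MAX_TOTAL_TOKENS": "8192",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # assumed; size to the model
)

print(predictor.predict({"inputs": "What is 17 * 23?"}))
predictor.delete_endpoint()  # clean up to avoid idle charges
```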

Real Estate

  • For a New Regulation That Takes Rural Areas and Families Into Account: Airbnb advocates for balanced short-term rental regulations in Spain, emphasizing the negative impacts of hastily imposed restrictions on rural areas, local businesses, and families. An Oxford Economics report highlights that short-term rentals contribute significantly to Spain’s GDP and support 400,000 jobs. Airbnb suggests a regulatory model that distinguishes between dedicated and occasional rentals, promotes fair data-driven policies, and supports rural tourism. The company stresses that extreme restrictions could harm local economies and fail to address housing and mass tourism issues effectively.

Philosophy

  • How to Think Differently About Love: In “How to Think Differently About Love,” Arina Pismenny explores the philosophical, cultural, and biological underpinnings of romantic love. The article draws on ancient myths, such as Aristophanes’ tale from Plato’s Symposium, as well as evolutionary theories and modern neuroscience to explain exclusivity, eternal love, and unrequited love. Pismenny critiques societal norms that idealize monogamous nuclear families, arguing that these constructs are rooted more in historical context than biological necessity. The piece examines concepts like “amatonormativity” and explores alternatives such as ethical nonmonogamy and polyamory, promoting a broader understanding of meaningful relationships that includes friendships and community bonds. By engaging with feminist critiques and cultural scripts, Pismenny encourages readers to rethink love beyond traditional paradigms, offering insights into how love can be a fulfilling and liberating experience.

Technology

  • Why We’re Bringing Pebble Back: Eric Migicovsky reflects on his journey with Pebble, the smartwatch he started building in 2008. Despite the original company’s failure, a devoted community kept Pebble’s legacy alive. Unsatisfied with existing smartwatch options, Migicovsky outlines plans to revive Pebble with new hardware and an open-source PebbleOS, enabled by Google’s release of the source code. He aims to create a sustainable, small-scale company focused on essential features, maintaining Pebble’s spirit and adaptability.
  • 2024 Gems of the Year Winners: The “2024 Gems of the Year Winners” announcement celebrates the top plugins for Obsidian, chosen by a panel and voted on by the community. Highlights include “Advanced Canvas” by Developer-Mike, which enhances Obsidian with tools like presentations and flowcharts. The winner in Language Models is the “Copilot” plugin by logancyang, which lets users interact with their notes using large language models. Runners-up include “Smart Connections” by brianpetro and “Ollama Chat” by brumik, both of which use AI to enhance note interaction.

Management

  • The Looking Glass: The Valuable Employee Paradox: In “The Looking Glass: The Valuable Employee Paradox,” Julie Zhuo explores the paradox that valuable employees both align with a manager’s directives and challenge and innovate beyond them. Great managers value reports who offer new perspectives and question norms, while average reports tend to follow instructions without critical thinking. This dynamic emphasizes that top performers, or “Jedi,” who critically engage with projects and prioritize team success over hierarchy, are indispensable. Hierarchical systems should facilitate decision-making but can falter if they discourage non-conformity and independent judgment.