🤖 AI News Daily: 2026-06-23 (Tuesday)

Auto-generated at 2026-06-23 08:32 | Sources: T1 official / T1.5 media / T2 community

🏆 T1 Official First-Hand Sources (highest weight)

📌 OpenAI

Designing delightful frontends with GPT-5.4
Practical techniques for steering GPT-5.4 toward polished, production-ready frontend designs.
Testing Agent Skills Systematically with Evals
A practical guide to turning agent skills into something you can test, score, and improve over time.
Using skills to accelerate OSS maintenance
Using skills and GitHub Actions to optimize Codex workflows in the OpenAI Agents SDK repos.
Blog
OpenAI Developer Blog. Insights for developers building with OpenAI. How Perplexity Brought Voice Search to Millions Using the Realtime API.
Codex blog posts
Trigger published ChatGPT workspace agents · Commerce. Build commerce flows in ChatGPT · Ads. Publish and measure ads in ChatGPT.

📌 Anthropic

Project Fetch: Phase two
We report results from our latest test of whether Claude can help Anthropic employees perform sophisticated robotics tasks.
Agentic coding and persistent returns to expertise
This report provides evidence on how Claude Code is used in practice, based on a privacy-preserving analysis of around 400,000 interactive ...
Research
Our research teams investigate the safety, inner workings, and societal impacts of AI models—so that artificial intelligence has a positive impact as ...
The assistant axis: situating and stabilizing the character of large language models
Left: Character archetypes form a "persona space," with the Assistant at one extreme of the "Assistant Axis." Right: Capping drift along ...
Labor market impacts of AI: A new measure and early evidence

📌 HuggingFace

Beyond LoRA: Can you beat the most popular fine-tuning technique?
We're on a journey to advance and democratize artificial intelligence through open source and open science.
GLM-5.2: Built for Long-Horizon Tasks
A Blog post by Z.ai on Hugging Face.
Introducing North Mini Code: Cohere's First Model For Developers
A Blog post by Cohere Labs on Hugging Face.
Community Blog & Articles
Is it agentic enough? Benchmarking open models on your own tooling

📌 GitHub

AI credits consumed per user now in the Copilot usage metrics API
The Copilot usage metrics API now reports how many AI credits each user consumed per day, derived from the same AI credits consumption data ...
Getting more from each token: How Copilot improves context handling and model routing
How GitHub Copilot is making more of each session go toward useful work, so your credits go further.
What are git worktrees, and why should I use them?
Git worktrees have been around since 2015, but they only recently became popular. Learn what they are, how to use them, and why ...
MAI-Code-1-Flash available on more Copilot surfaces
MAI-Code-1-Flash, Microsoft's purpose-built small coding model, is now available across additional GitHub Copilot surfaces.
How we built an internal data analytics agent
Learn how GitHub built Qubot, our internal Copilot-powered analytics agent, to allow any GitHub employee to ask questions about our data in ...

📌 Apple ML Research

📌 NVIDIA Blog

Hotter Than a Hot Tub: The 45°C Breakthrough to Cool AI's Biggest Machines

📌 xAI

📌 Simon Willison

sqlite-utils 4.0rc1 adds migrations and nested transactions

📰 T1.5 Media Coverage

Sakana AI's Fugu orchestrates multiple LLMs to match Anthropic's Fable and Mythos benchmarks (The Decoder)
Landmark German ruling declares Google's AI Overviews are Google's own words and makes it liable for false answers (The Decoder)
Microsoft researcher builds a working neural network out of goats in Age of Empires II to critique AI science (The Decoder)
Vibecoding is becoming a deal-breaker test for software acquisitions (The Decoder)
ChatGPT's new health upgrade beats doctor-written answers, OpenAI says (The Decoder)
NAIRR Science Program Reshapes Scientific Research, Powered by NVIDIA AI Infrastructure (ithome)
NVIDIA Vera Rubin Delivers World-Class Supercomputers for Science (ithome)
Ask HN: Is it time to fork HN into AI/LLM and "Everything else/other?" (HackerNews)
Points: 553
Coconut by Meta AI – Better LLM Reasoning with Chain of Continuous Thought? (HackerNews)
Points: 362
Show HN: Countless.dev – A website to compare every AI model: LLMs, TTSs, STTs (HackerNews)
Points: 361
Taste in the age of AI and LLMs (HackerNews)
Points: 265
Wikimedia Enterprise – APIs for LLMs, AI Training, and More (HackerNews)
Points: 222
Anti-AI Hype LLM Reading List (HackerNews)
Points: 208
Jellyfin LLM/"AI" Development Policy (HackerNews)
Points: 207
Ask HN: Go deep into AI/LLMs or just use them as tools? (HackerNews)
Points: 195
Advent of Code 2023's new AI/LLM Policy (HackerNews)
Points: 174
Should we use AI and LLMs for Christian apologetics? (2024) (HackerNews)
Points: 170
RFC: Banning "AI"-backed (LLM/GPT/whatever) contributions to Gentoo (HackerNews)
Points: 168
Benchmarks and comparison of LLM AI models and API hosting providers (HackerNews)
Points: 152
Pi.ai LLM Outperforms Palm/GPT3.5 (HackerNews)
Points: 151
A guide to Gen AI / LLM vibecoding for expert programmers (HackerNews)
Points: 128
Declarative Programming with AI/LLMs (HackerNews)
Points: 121

🔬 T2 Community & Academic

📚 Twitter KOL

AI KOL Twitter Accounts Overview - Meng Shao, Wang Shuyi, HongyuanCao
Meng Shao (@shao__meng) posts quick digests of AI papers and open-source projects. Wang Shuyi (@wshuyi) is an AI professor focused on academic sharing. HongyuanCao (@HongyuanCao) is an AI researcher and ...
Sahara AI - Blockchain-based AI Infrastructure for Asset Control
Sahara AI provides infrastructure for 'AI development' and 'AI data/model assetization,' building a blockchain-based AI platform for secure control and copyright protection of AI assets.
Clawbot - Personal AI Assistant Service Strategy
Clawbot's core is a 'proactive agent' that can access local files, the browser, and apps, acting like a private secretary: auto-replying to emails, organizing schedules, and generating daily reports.

📚 ArXiv

From AGI to ASI
Tim Genewein, Matija Franklin, Alexander Lerchner, Laurent Orseau - Google DeepMind. Published: Jun 09, 2026.
Patterns, Predictions, and Actions: A Story about Machine Learning
This graduate textbook on machine learning tells a story of how patterns in data support predictions and consequential actions.
Optimal Transport for Machine Learners
Optimal transport is useful because it compares objects by asking how mass should move. Published: May 10, 2025.
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
This paper provides an overview of the computational and theoretical foundations of multimodal machine learning. Published: Sep 07, 2022.
Machine Learning Methods for Studying Latent Neural Activity Dynamics
Recent developments in brain recording are driving demand for machine learning tools capable of decoding latent structure. Published: Jun 09, 2026.
MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery
Large language model (LLM) agents are used for automated machine learning algorithm discovery. Published: Jun 09, 2026.
Machine Learning in Biomechanics: Key Applications and Limitations in Walking, Running, and Sports Movements
Overview of recent ML applications: pose estimation, feature estimation, event detection. Published: Mar 05, 2025.
Position: The AI and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process
Position paper advocating for more transparent, open, and well-regulated peer review in the AI/ML community. Published: Feb 02, 2025.

📚 HF Daily Papers

When, Where, and How: Adaptive Binning for Tabular Self-Supervised Learning ⬆️1
Medical tabular data are ubiquitous in clinical research, but deep learning for tables remains underexplored because reliable labels often require costly expert adjudication, even though structured cl...
Characterizing Narrative Content in Web-scale LLM Pretraining Data ⬆️2
The narrative composition of web-scale LLM pretraining corpora remains largely unexplored even though narrative is a fundamental mode of human communication. We present the first fine-grained study of...
SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG ⬆️8
Retrieval-augmented generation (RAG) systems must balance retrieval granularity with contextual coherence, a challenge that existing methods address through LLM-guided chunking, single-level context e...
MCompassRAG: Topic Metadata as a Semantic Compass for Paragraph-Level Retrieval ⬆️10
Retrieval-augmented generation (RAG) systems depend critically on how documents are chunked and searched. Fine-grained chunks can improve retrieval precision but expand the search space, increasing la...
StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs ⬆️2
Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these models judge people remain poorly under...
MemSlides: A Hierarchical Memory Driven Agent Framework for Personalized Slide Generation with Multi-turn Local Revision ⬆️14
Personalized presentation generation requires more than conditioning on a current prompt or template: agents must preserve stable user preferences across tasks, retain newly introduced preferences and...
Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models ⬆️9
While reasoning on autoregressive (AR) models is often performed by chain-of-thought reasoning and reflection, their refinement of previous outputs still relies on fully sequential generation, even wh...
GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning ⬆️3
Generalist vision-language-action systems need object-centric 3D evidence and reusable manipulation experience to plan reliable robot trajectories. GeneralVLA provides a hierarchical interface for con...
SpatialAvatar-0: High-Quality 4D Head Avatar with Multi-Stage Reconstruction ⬆️3
High-quality 4D head avatars from one or a few source portraits are central to telepresence, AR/VR, and digital-human interaction. 3D Gaussian Splatting (3DGS) has emerged as the dominant representati...
Distilling Examples into Task Instructions: Enhanced In-Context Learning for Real-World B2B Conversations ⬆️2
In-context learning (ICL) is the standard method for low-resource classification, yet its efficacy in specialized domains remains largely unexplored. We address the challenge of classifying semantical...
GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents ⬆️13
Memory benchmarks for LLM agents largely assume single-user settings, leaving shared assistants for hospitals, workplaces, campuses, and households understudied. In these deployments, multiple princip...
BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation ⬆️7
Three-dimensional (3D) brain MRI is central to clinical neurology and neuro-oncology, where generative models could augment under-represented cohorts, simulate disease trajectories, and support privac...
WorldLines: Benchmarking and Modeling Long-Horizon Stateful Embodied Agents ⬆️3
To assist humans over extended periods in real homes, embodied agents must remember user routines, world states, and past interactions. Existing long-term memory benchmarks mainly evaluate language-ce...
PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models ⬆️50
Multimodal large language models (MLLMs) have achieved remarkable progress in visual understanding tasks. However, most existing MLLMs rely on autoregressive generation, which limits their efficiency ...
LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents ⬆️6
Policy-adherent tool-calling agents in customer-service domains must maintain task states across turns while calling tools and obeying domain policies. Task states consist of relevant facts, identifie...

This report was auto-generated by Hermes Agent | 2026-06-23