🤖 AI News Daily | 2026-06-07 Sunday

Auto-aggregated · Updated daily · Data sources: OpenAI / Anthropic / HF / GitHub / NVIDIA / Apple / xAI / Simon Willison / The Decoder / ITHome / HackerNews / arXiv

📌 T1 Official first-party sources (highest weight)

1. Reasoning models don't always say what they think

Source: Anthropic research latest | Score: 0.99
Summary: Research from Anthropic on the faithfulness of AI models' Chain-of-Thought.
Link: https://www.anthropic.com/research/reasoning-models-dont-say-think...

2. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

Source: Apple ML research | Score: 0.99
Summary: Apple is presenting new research at the annual IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), which takes place in…
Link: https://machinelearning.apple.com/updates/apple-at-cvpr-2026...

3. Harness, Scaffold, and the AI Agent Terms Worth Getting Right

Source: Hugging Face blog latest | Score: 0.99
Summary: We're on a journey to advance and democratize artificial intelligence through open source and open science.
Link: https://huggingface.co/blog/agent-glossary...

4. NVIDIA Enables the Next Era Of Physical AI Research With Agent Skills For Autonomous Vehicles, Robotics And Vision AI

Source: NVIDIA AI blog | Score: 0.99
Summary: New physical AI agent skills, powered by NVIDIA Cosmos 3, help researchers accelerate data generation, simulation, policy training and
Link: https://blogs.nvidia.com/blog/cvpr-physical-ai-research-agent-skills/...

5. GitHub Universe is back: All together now, in the agentic era

Source: GitHub blog AI | Score: 0.99
Summary: Software development has always been deeply collaborative. Today, that collaboration goes beyond just people, extending to tools,
Link: https://github.blog/news-insights/company-news/github-universe-is-back-all-toget...

6. GitHub Copilot app: The agent-native desktop experience

Source: GitHub blog AI | Score: 0.98
Summary: At Microsoft Build 2026, GitHub introduced new tools, updates, and surfaces so agents can work the way you already work.
Link: https://github.blog/news-insights/product-news/github-copilot-app-the-agent-nati...

7. RVPO: Risk-Sensitive Alignment via Variance Regularization

Source: Apple ML research | Score: 0.98
Summary: Current critic-less RLHF methods aggregate multi-objective rewards via an arithmetic mean, leaving them vulnerable to constraint neglect:…
Link: https://machinelearning.apple.com/research/rvpo-risk-sensitive-alignment...

8. Use Grok in Kilo Code

Source: xAI news | Score: 0.98
Summary: Use your SuperGrok or X Premium+ subscription inside Kilo Code, the open-source agentic coding platform.
Link: https://x.ai/news/grok-kilocode...

9. How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent

Source: Hugging Face blog latest | Score: 0.98
Summary: A Blog post by NVIDIA on Hugging Face.
Link: https://huggingface.co/blog/nvidia/fine-tuning-nemotron-35-asr...

10. Anthropic Economic Index report: Economic primitives

Source: Anthropic research latest | Score: 0.98
Summary: This report introduces new metrics of AI usage to provide a rich portrait of interactions with Claude in November 2025, just prior to the
Link: https://www.anthropic.com/research/anthropic-economic-index-january-2026-report...

11. How Perplexity Brought Voice Search to Millions Using the Realtime API

Source: OpenAI blog latest | Score: 0.98
Summary: Lessons from how Perplexity Computer's voice agent was built with the Realtime API.
Link: https://developers.openai.com/blog/realtime-perplexity-computer...

12. OpenAI Help: Lockdown Mode

Source: Simon Willison blog | Score: 0.98
Summary: Lockdown Mode is designed to help prevent the final stage of data exfiltration from a prompt injection attack by limiting outbound network
Link: https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/...

13. An update on our model deprecation commitments for Claude Opus 3

Source: Anthropic research latest | Score: 0.98
Summary: Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
Link: https://www.anthropic.com/research/deprecation-updates-opus-3...

14. The last six months in LLMs in five minutes

Source: Simon Willison blog | Score: 0.98
Summary: 19th May 2026. I put together these annotated slides from my five minute lightning talk at PyCon US 2026, using the latest iteration of my
Link: https://simonwillison.net/2026/may/19/5-minute-llms/...

15. Thousand Token Wood: shipping a multi-agent economy on a 3B model

Source: Hugging Face blog latest | Score: 0.98
Summary: A Blog post by Build Small Hackathon on Hugging Face.
Link: https://huggingface.co/blog/build-small-hackathon/thousand-token-wood-sim...

16. NVIDIA Partners With Microsoft on Unified Stack for Agentic AI Deployment, From Windows Devices to Cloud to Local

Source: NVIDIA AI blog | Score: 0.98
Summary: At Microsoft Build, NVIDIA founder and CEO Jensen Huang joined Microsoft chairman and CEO Satya Nadella's keynote via livestream from Taipei
Link: https://blogs.nvidia.com/blog/microsoft-build-windows-local-cloud-devices/...

17. Testing Agent Skills Systematically with Evals

Source: OpenAI blog latest | Score: 0.98
Summary: A practical guide to turning agent skills into something you can test, score, and improve over time.
Link: https://developers.openai.com/blog/eval-skills...

18. Grok Imagine 1.5 Preview

Source: xAI news | Score: 0.98
Summary: grok-imagine-video-1.5-preview, our latest image-to-video model, is now available via the xAI API in preview.
Link: https://x.ai/news/grok-imagine-1-5...

19. Velox: Learning Representations of 4D Geometry and Appearance

Source: Apple ML research | Score: 0.98
Summary: We introduce a framework for learning latent representations of 4D objects which are descriptive, faithfully capturing object geometry and…
Link: https://machinelearning.apple.com/research/velox...

20. Larger context windows and configurable reasoning levels for GitHub Copilot

Source: GitHub blog AI | Score: 0.98
Summary: GitHub Copilot now supports larger context windows and configurable reasoning levels to help you tackle deeper, more complex work.
Link: https://github.blog/changelog/2026-06-04-larger-context-windows-and-configurable...

📰 T1.5 Media + trending community posts

1. OpenAI CEO Sam Altman sees "proactive AI" as the next big phase after chatbots and agents

Source: AI news today the-decoder
Summary: OpenAI CEO Sam Altman outlines the next phase of AI products: a "proactive AI" that runs constantly in the background and acts on its own
Link: https://the-decoder.com/openai-ceo-sam-altman-sees-proactive-ai-as-the-next-big-...

2. xAI updates Grok Imagine to 1.5 with image-to-video generation at 720p resolution

Source: AI news today the-decoder
Summary: Elon Musk's AI company xAI has released Grok Imagine Video 1.5 in preview, a new image-to-video model. The model turns a single still image
Link: https://the-decoder.com/xai-updates-grok-imagine-to-1-5-with-image-to-video-gene...

3. Anthropic's Mythos model is reportedly powering NSA offensive cyber ops against China and Iran

Source: AI news today the-decoder
Summary: Anthropic has reportedly stationed about half a dozen engineers directly at the NSA to adapt its Mythos AI model for offensive cyber
Link: https://the-decoder.com/anthropics-mythos-model-is-reportedly-powering-nsa-offen...

4. MiniMax M3: Open-weight model with a million-token context challenges proprietary leaders

Source: AI news today the-decoder
Summary: Chinese AI company MiniMax has released its new model M3. It's billed as the first open-weight model to combine top-tier coding performance,
Link: https://the-decoder.com/minimax-m3-open-weight-model-with-a-million-token-contex...

5. Nvidia's Nemotron 3 Ultra becomes the smartest open US model, but China still leads

Source: AI news today the-decoder
Summary: According to benchmark platform Artificial Analysis, Nvidia's new Nemotron 3 Ultra is the most capable open AI model from the US to date.
Link: https://the-decoder.com/nvidias-nemotron-3-ultra-becomes-the-smartest-open-us-mo...

6. Meta confirms 1000s of Instagram accounts were hacked by abusing its AI chatbot

Source: HackerNews
Link: https://this.weekinsecurity.com/meta-confirms-thousands-of-instagram-accounts-we...

7. Harness engineering: Leveraging Codex in an agent-first world

Source: HackerNews
Link: https://openai.com/index/harness-engineering/...

8. Nvidia is proposing a beast of a CPU system for Windows PCs

Source: HackerNews
Link: https://twitter.com/lemire/status/2062880075117113739...

9. Computex 2026: Are We Heading for the Agentic PC Era Yet?

Source: HackerNews
Link: https://www.eetimes.com/computex-2026-are-we-heading-for-the-agentic-pc-era-yet/...

10. Google to pay SpaceX $920M a month for compute capacity at xAI data centers

Source: HackerNews
Link: https://www.cnbc.com/2026/06/05/google-to-pay-spacex-920-million-a-month-for-xai...

💬 T2 KOL opinions + academic papers

1. [2605.15156] MeMo: Memory as a Model

Source: AI LLM arxiv
Summary: Large language models (LLMs) achieve strong performance across a wide range of tasks, but remain frozen after pretraining until
Link: https://arxiv.org/abs/2605.15156...

2. Advancing Mathematics Research with AI-Driven Formal Proof Search

Source: AI LLM arxiv
Summary: Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in
Link: https://arxiv.org/html/2605.22763v1...

3. [2605.06445] Constraint Decay: The Fragility of LLM Agents in Backend Code Generation

Source: AI LLM arxiv
Summary: Abstract page for arXiv paper 2605.06445: Constraint Decay: The Fragility of LLM Agents in Backend Code Generation.
Link: https://arxiv.org/abs/2605.06445...

4. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task

Source: AI LLM arxiv
Summary: This study explores the neural and behavioral consequences of LLM-assisted essay writing. Participants were divided into three groups: LLM, Search Engine, and
Link: https://arxiv.org/abs/2506.08872...

5. [1706.03762] Attention Is All You Need

Source: AI LLM arxiv
Summary: We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
Link: https://arxiv.org/abs/1706.03762...

📊 Statistics for this issue

Metric	Value
T1 official sources	43 items
T1.5 media/community	12 items
T2 KOL/papers	5 items
Total	60 items

🤖 This report was generated automatically by Hermes Agent | 2026-06-07 08:31