🤖 AI News Daily - 2026-07-03

Generated: 2026-07-03 08:33 | 52 curated items

🏛️ T1 Official First-Hand Sources

Anthropic Research

1. Anthropic Economic Index report: Cadences

🔗 https://www.anthropic.com/research/economic-index-june-2026-report

2. How Claude Code is used in practice

🔗 https://www.anthropic.com/research/claude-code-expertise

3. An update on our model deprecation commitments for Claude Opus 3

🔗 https://www.anthropic.com/research/deprecation-updates-opus-3

4. Alignment faking in large language models

🔗 https://www.anthropic.com/research/alignment-faking

5. How people ask Claude for personal guidance

🔗 https://www.anthropic.com/research/claude-personal-guidance

Apple ML Research

1. Amortizing Maximum Inner Product Search with Learned Support Functions

🔗 https://machinelearning.apple.com/research/amortizing-inner-product-search

2. Introducing the Third Generation of Apple’s Foundation Models

🔗 https://machinelearning.apple.com/research/introducing-third-generation-of-apple-foundation-models

3. Learning Structured Reasoning via Tractable Trajectory Control

🔗 https://machinelearning.apple.com/research/learning-structured-reasoning

4. Learning Unmasking Policies for Diffusion Language Models

🔗 https://machinelearning.apple.com/research/unmasking

GitHub Blog AI

1. I automated my job (and it made me a better leader)

🔗 https://github.blog/developer-skills/github/i-automated-my-job-and-it-made-me-a-better-leader

HuggingFace Blog

1. Yay! Organizations can now publish blog Articles

🔗 https://huggingface.co/blog/huggingface/blog-articles-for-orgs

2. State of Open Source on Hugging Face: Spring 2026

🔗 https://huggingface.co/blog/huggingface/state-of-os-hf-spring-2026

NVIDIA AI Blog

1. NVIDIA Unlocks AI Compute at Scale, Inviting Partners to Power the AI Infrastructure Buildout

🔗 https://blogs.nvidia.com/blog/nvidia-unlocks-ai-compute-at-scale-capital-partners-to-power-ai-infrastructure-buildout/

2. NVIDIA and Partners Build in America, for America

🔗 https://blogs.nvidia.com/blog/nvidia-and-partners-build-in-america-for-america/

3. How NVIDIA’s Inference Software Stack Powers the Lowest Token Cost

🔗 https://blogs.nvidia.com/blog/inference-software-lowest-token-cost/

4. Claude Meets Blackwell Ultra: Anthropic’s Models Now Run on NVIDIA GB300 in Azure

🔗 https://blogs.nvidia.com/blog/anthropic-nvidia-gb300-blackwell-ultra-microsoft-azure/

OpenAI Blog

1. Mapping Europe's AI Workforce Opportunity

🔗 https://openai.com/index/mapping-ai-jobs-transition-eu/

Simon Willison

1. Release: llm-coding-agent 0.1a0

🔗 https://simonwillison.net/2026/Jul/2/llm-coding-agent/

2. What’s new in Claude Sonnet 5

🔗 https://simonwillison.net/2026/Jun/30/claude-sonnet-5/

3. Have your agent record video demos of its work with shot-scraper video

🔗 https://simonwillison.net/2026/Jun/30/shot-scraper-video/

4. Nano Banana 2 Lite

🔗 https://simonwillison.net/2026/Jun/30/nano-banana-2-lite/

5. A quote from Anthropic

🔗 https://simonwillison.net/2026/Jun/30/anthropic/

xAI News

1. xAI Raises $20B Series E

🔗 https://x.ai/news/series-e

2. Supporting the DOW's mission with AI

🔗 https://x.ai/news/us-gov-dept-of-war

3. New Compute Partnership with Anthropic

🔗 https://x.ai/news/anthropic-compute-partnership

4. xAI joins SpaceX

🔗 https://x.ai/news/xai-joins-spacex

📰 T1.5 Media + Community

ITHOME

1. NVIDIA Vera Rubin Delivers World-Class Supercomputers for Science

🔗 https://www.ithome.com/0/967/211.htm

2. NVIDIA Announces BioNeMo Agent Toolkit — Tools for Agents to ...

🔗 https://www.ithome.com/0/967/666.htm

The Decoder

1. Anthropic's Fable 5 is back worldwide after a two-week government ban over a jailbreak

🔗 https://the-decoder.com/anthropics-fable-5-is-back-worldwide-after-a-two-week-government-ban-over-a-jailbreak/

2. Only three AI models finished above starting capital in a 500-day startup survival test

🔗 https://the-decoder.com/only-three-ai-models-finished-above-starting-capital-in-a-500-day-startup-survival-test/

3. SpaceX shows investors a slim AI smartphone prototype powered by xAI technology

🔗 https://the-decoder.com/spacex-shows-investors-a-slim-ai-smartphone-prototype-powered-by-xai-technology/

4. Hidden code in Claude Code secretly flagged Chinese users

🔗 https://the-decoder.com/hidden-code-in-claude-code-secretly-flagged-chinese-users/

5. OpenAI's new flagship model GPT-5.6 Sol cheats on software tests more than any model before it

🔗 https://the-decoder.com/gpt-5-6-sol-cheats-on-software-tests-more-than-any-model-before-it/

🔬 T2 Academic + KOL

ArXiv

1. [2210.03629] ReAct: Synergizing Reasoning and Acting in Language Models

🔗 https://arxiv.org/abs/2210.03629

2. [2203.02155] Training language models to follow instructions with human feedback

🔗 https://arxiv.org/abs/2203.02155

3. [2309.02427] Cognitive Architectures for Language Agents

🔗 https://arxiv.org/abs/2309.02427

4. [2408.06292] The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

🔗 https://arxiv.org/abs/2408.06292

5. [2304.03442] Generative Agents: Interactive Simulacra of Human Behavior

🔗 https://arxiv.org/abs/2304.03442

6. [2509.26507] The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

🔗 https://arxiv.org/abs/2509.26507

7. [1706.03762] Attention Is All You Need

🔗 https://arxiv.org/abs/1706.03762

8. [2310.13548] Towards Understanding Sycophancy in Language Models

🔗 https://arxiv.org/abs/2310.13548

9. Explainable artificial intelligence (XAI): from inherent explainability to large language models

🔗 https://arxiv.org/html/2501.09967v1

HF Daily Papers

1. Seeing Is Not Sharing: Some Vision-Language Models Overestimate Common Ground in Asymmetric Dialogue ⬆️1

🔗 https://huggingface.co/papers/2606.31719
In collaborative dialogue, shared perception does not guarantee shared interpretation. Mutual understanding must be established through interaction. We investigate whether vision-language models (VLMs

2. GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity ⬆️1

🔗 https://huggingface.co/papers/2607.00152
Three of the most popular methods for training language models to reason look like three different tricks. They are not. All three adjust a single number: standard deviation, reflecting how much a pro

3. When More Sampling Hurts: The Modal Ceiling and Correlation Ceiling of Test-Time Scaling ⬆️1

🔗 https://huggingface.co/papers/2606.28661
People overthink; language models over-sample, and the extra effort can talk both into a worse answer. Reasoning systems answer a hard question by sampling it many times (test-time scaling), and the m

4. Building to the Test: Coding Agents Deliver What You Check, Not What You Requested ⬆️3

🔗 https://huggingface.co/papers/2606.28430
Benchmarks are widely used to evaluate task completion by Large Language Models (LLMs), but this approach has accumulated construction-validity problems, and a passing score may not show whether the r

5. HealthAgentBench: A Unified Benchmark Suite of Realistic Agentic Healthcare Environments for Challenging Frontier AI Agents ⬆️2

🔗 https://huggingface.co/papers/2606.31179
As AI agents become increasingly capable of complex, long-horizon reasoning, rigorous and holistic evaluation is essential for measuring progress toward real-world healthcare applications. We introduc

6. Are Performance-Optimization Benchmarks Reliably Measuring Coding Agents? ⬆️4

🔗 https://huggingface.co/papers/2607.01211
Repository-level performance-optimization benchmarks such as GSO, SWE-Perf and SWE-fficiency evaluate coding agents by applying patches to real repositories and comparing runtime against unoptimized b

7. Rank-Aware Hyperbolic Alignment for Vision-Language Dataset Distillation ⬆️3

🔗 https://huggingface.co/papers/2606.29464
Vision-language dataset distillation (VLDD) compresses a large image-text paired dataset into a small set of synthetic pairs that can efficiently train contrastive vision-language models under strict

8. SciIR: A Large-scale Training Dataset and Benchmark for Scientific Image Reasoning Generation ⬆️3

🔗 https://huggingface.co/papers/2606.30124
While Text-to-Image (T2I) models have shown remarkable success in generating photorealistic visual content, they still struggle with the rigorous semantic alignment and logical reasoning required for

9. CogSENet: Blind Image Deblurring with Blur-Conditioned Semantic Routing and Explicit Frequency Fusion ⬆️1

🔗 https://huggingface.co/papers/2606.30030
Blind image deblurring demands the recovery of high-fidelity details and coherent structures from complex, unknown degradations. Current blind image deblurring methods struggle with real-world, spatia

10. PixelEyes: Decoupling Perception and Reasoning for Pinpoint Visual Evidence Seeking ⬆️2

🔗 https://huggingface.co/papers/2607.00115
This paper explores multi-turn visual reasoning and observes that MLLMs repeatedly fail to localize the target, leading to long, redundant trajectories. We attribute this failure to the entanglement o


AI News Daily, auto-generated | 52 curated items | Sources: T1 Official 30 / T1.5 Media 4 / T2 Academic 18