Shiyu's Lab

university
https://code-terminator.github.io/

Papers

How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings

Team members: Shiyu Chang, Yujian Liu, Jiabao Ji, Li An, Qiucheng Wu, Bairu Hou, Jingbo Yang

Shiyu-Lab's datasets (15)

Shiyu-Lab/Skill-Usage

Preview • Updated Apr 7 • 884

Shiyu-Lab/WebArena_video_demo

Viewer • Updated Jan 20 • 3.69k • 25

Shiyu-Lab/OSWorld_video_demo

Preview • Updated Jan 20 • 6

Shiyu-Lab/Testcase_eval_data

Viewer • Updated Nov 10, 2025 • 215 • 26

Shiyu-Lab/Testcase_RL_Data

Viewer • Updated Nov 10, 2025 • 12k • 14

Shiyu-Lab/Inputoutput_SFT_Data

Viewer • Updated Nov 2, 2025 • 15.6k • 16

Shiyu-Lab/HarnessLLM_SFT_Data

Viewer • Updated Nov 2, 2025 • 15.6k • 57

Shiyu-Lab/Testcase_MBPPHard

Viewer • Updated Nov 2, 2025 • 141 • 38

Shiyu-Lab/Testcase_CF_Seen

Viewer • Updated Oct 13, 2025 • 100 • 15

Shiyu-Lab/Testcase_CF_Unseen

Viewer • Updated Oct 13, 2025 • 84 • 14

Shiyu-Lab/Testcase_LCB_Unseen

Viewer • Updated Oct 13, 2025 • 93 • 6

Shiyu-Lab/Testcase_LCB_Seen

Viewer • Updated Oct 13, 2025 • 76 • 7

Shiyu-Lab/C4-contrastive-watermark

Viewer • Updated Apr 4, 2025 • 8.7k • 8

Shiyu-Lab/Prereq_Tune

Viewer • Updated Jan 9, 2025 • 2.22M • 579

Shiyu-Lab/Wikipedia_Person_Unlearn

Viewer • Updated Jul 20, 2024 • 4.51k • 189 • 3