Security Surveillance AI: A Complete Guide to Face and Behavior Recognition Labeling

TjMakeBot Team | Industry Applications | 20 min read

🔐 Introduction: The Data Challenge of Intelligent Security

Security surveillance is one of the most mature fields for AI deployment. According to IDC data, the global intelligent video surveillance market exceeded $50 billion in 2025, with China accounting for over 40%. From facial recognition gates at airports and train stations to behavior analysis systems in shopping malls and campuses, AI is redefining the boundaries of the security industry.

However, the high-precision requirements of security AI pose significant challenges for data labeling:

  • Face recognition must maintain high accuracy under complex lighting, multiple angles, and occlusion conditions
  • Behavior recognition requires understanding body posture, action sequences, and scene semantics
  • Privacy compliance demands personal information protection throughout the entire data processing pipeline

This article takes a deep dive into labeling techniques for security surveillance AI, covering face detection, facial landmarks, body pose, behavior recognition, and more, helping you build high-quality security datasets.

🎯 Core Tasks in Security AI Labeling

1. Face Detection and Recognition

Face Detection: Locating all faces in an image and outputting bounding box coordinates.

Labeling elements:

  • Bounding Box: The smallest rectangle containing the complete face
  • Confidence: The certainty of face presence
  • Occlusion Level: The proportion of the face that is occluded

Facial Landmarks: Labeling key feature points on the face, used for face alignment and expression analysis.

Common landmark schemes:

  • 5-point: Left eye center, right eye center, nose tip, left mouth corner, right mouth corner
  • 68-point: Detailed facial contour, eyebrows, eyes, nose, mouth
  • 106-point: More refined facial features, suitable for beauty filters, face swapping, etc.

Face Attributes: Labeling various attribute information of faces:

  • Gender: Male/Female
  • Age Group: Child/Teenager/Young Adult/Middle-aged/Elderly
  • Expression: Neutral/Happy/Surprised/Angry/Sad, etc.
  • Accessories: Glasses/Mask/Hat, etc.
  • Pose Angles: Pitch/Yaw/Roll

2. Person Detection and Pose Estimation

Person Detection: Locating all human bodies in an image.

Labeling elements:

  • Full-body bounding box: Containing the complete body
  • Visible part box: Containing only the visible body parts
  • Occlusion markers: Labeling occluded body parts

Body Keypoints: Labeling key skeletal nodes of the human body for pose estimation.

COCO format 17-point scheme:

0: Nose (nose)
1: Left Eye (left_eye)
2: Right Eye (right_eye)
3: Left Ear (left_ear)
4: Right Ear (right_ear)
5: Left Shoulder (left_shoulder)
6: Right Shoulder (right_shoulder)
7: Left Elbow (left_elbow)
8: Right Elbow (right_elbow)
9: Left Wrist (left_wrist)
10: Right Wrist (right_wrist)
11: Left Hip (left_hip)
12: Right Hip (right_hip)
13: Left Knee (left_knee)
14: Right Knee (right_knee)
15: Left Ankle (left_ankle)
16: Right Ankle (right_ankle)

Each keypoint requires labeling:

  • Coordinates (x, y)
  • Visibility (visible/occluded/not_labeled)
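
As a minimal sketch of how the 17-point scheme is usually serialized, the snippet below packs per-keypoint labels into the flat COCO-style `keypoints` array of `[x, y, v]` triplets, where `v` is 0 (not labeled), 1 (labeled but occluded), or 2 (labeled and visible). The helper name `encode_keypoints` and its dictionary input are illustrative assumptions, not part of any particular tool.

```python
# COCO 17-keypoint order, matching the list above.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def encode_keypoints(points):
    """Build a COCO-style keypoint annotation (hypothetical helper).

    points maps a keypoint name to (x, y, v); names that are missing
    are treated as not labeled (v = 0).
    """
    flat = []
    for name in COCO_KEYPOINTS:
        x, y, v = points.get(name, (0, 0, 0))
        if v not in (0, 1, 2):
            raise ValueError(f"invalid visibility {v!r} for {name}")
        if v == 0:
            x, y = 0, 0  # unlabeled keypoints carry zero coordinates
        flat.extend([x, y, v])
    # num_keypoints counts every labeled keypoint (v > 0).
    num_labeled = sum(1 for v in flat[2::3] if v > 0)
    return {"keypoints": flat, "num_keypoints": num_labeled}
```

For example, `encode_keypoints({"nose": (120, 80, 2), "left_eye": (125, 75, 1)})` yields a 51-element array with two labeled keypoints.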

3. Behavior Recognition

Action Classification: Identifying the type of action a person is performing.

Common actions in security scenarios:

  • Normal behaviors: Walking, standing, sitting, talking, using a phone
  • Suspicious behaviors: Loitering, looking around, following, gathering
  • Abnormal behaviors: Running, falling, fighting, climbing over, trespassing

Temporal Action Annotation: Labeling the start and end times of actions in video.

Annotation format:

{
  "video_id": "camera_01_20260130",
  "actions": [
    {
      "action": "walking",
      "person_id": 1,
      "start_frame": 0,
      "end_frame": 150,
      "start_time": "00:00:00",
      "end_time": "00:00:05"
    },
    {
      "action": "running",
      "person_id": 1,
      "start_frame": 151,
      "end_frame": 300,
      "start_time": "00:00:05",
      "end_time": "00:00:10"
    }
  ]
}
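
The frame numbers and timestamps in the example above line up if the video runs at 30 frames per second (150 frames in 5 seconds). That frame rate is inferred from the example rather than stated, so treat it as an assumption. A small helper like the following sketch keeps the two fields in sync instead of typing timestamps by hand:

```python
def frame_to_timestamp(frame, fps=30):
    """Convert a 0-based frame index to HH:MM:SS, truncating partial seconds.

    fps=30 is an assumption inferred from the example annotation.
    """
    total_seconds = int(frame // fps)
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

With this convention, frame 150 and frame 151 both map to `00:00:05`, which is why the first action's end time equals the second action's start time.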

💡 Labeling Strategies and Best Practices

Strategy 1: Face Labeling Standards

Bounding Box Labeling Rules:

Rule 1: Box Coverage
- Include the complete facial area (from hairline to chin)
- Include ears (if visible)
- Do not include excessive background (margin within 10% of face width)

Rule 2: Occlusion Handling
- Occlusion <30%: Label the complete face box normally
- Occlusion 30%-70%: Label the visible part, mark as "partially occluded"
- Occlusion >70%: Label the visible part, mark as "severely occluded"

Rule 3: Special Cases
- Profile (>45°): Label the visible facial area
- Blurry faces: If recognizable as a face, still label it
- Faces in photos/posters: Decide based on project requirements
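
The occlusion thresholds in Rule 2 can be encoded directly, so that annotators and validation scripts apply the same cutoffs. This is a sketch only: the function name and the tag strings are illustrative, and the text does not say which bucket a value of exactly 70% belongs to, so this version assumes it counts as partially occluded.

```python
def occlusion_label(occluded_fraction):
    """Map an occluded fraction (0.0 to 1.0) to the box rule and tag.

    Tag strings are hypothetical; the thresholds follow Rule 2 above.
    """
    if not 0.0 <= occluded_fraction <= 1.0:
        raise ValueError("occluded_fraction must be between 0 and 1")
    if occluded_fraction < 0.30:
        return {"box": "full_face", "tag": "normal"}
    if occluded_fraction <= 0.70:  # assumption: exactly 70% is "partial"
        return {"box": "visible_part", "tag": "partially_occluded"}
    return {"box": "visible_part", "tag": "severely_occluded"}
```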

Landmark Labeling Rules:

Rule 1: Precise Positioning
- Eyes: Label the pupil center
- Nose: Label the most protruding point of the nose tip
- Mouth: Label the mouth corners and midpoints of upper and lower lips

Rule 2: Occlusion Handling
- Occluded keypoints: Label the estimated position, mark as "occluded"
- Completely invisible: Mark as "not_visible"

Rule 3: Consistency
- For the same person in the same video, keypoint positions should remain coherent
- Avoid inter-frame jitter
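
One way to spot inter-frame jitter automatically is to look at the second difference of a landmark's track: it stays near zero while the point moves smoothly and spikes when a single frame is mislabeled. The sketch below is one possible heuristic, not a prescribed method, and any flagging threshold would need tuning per project.

```python
import math


def jitter_scores(track):
    """Second-difference magnitude (pixels) for each interior frame.

    track is a list of (x, y) positions of one landmark, one per frame.
    A point moving at constant velocity scores 0.
    """
    scores = []
    for prev, cur, nxt in zip(track, track[1:], track[2:]):
        ax = prev[0] - 2 * cur[0] + nxt[0]
        ay = prev[1] - 2 * cur[1] + nxt[1]
        scores.append(math.hypot(ax, ay))
    return scores
```

A smooth track such as `[(0, 0), (1, 1), (2, 2), (3, 3)]` scores zero throughout, while a frame that jumps a few pixels off the path produces a clear spike.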

Attribute Labeling Rules:

Age Group Classification:
- Child: 0-12 years
- Teenager: 13-17 years
- Young Adult: 18-35 years
- Middle-aged: 36-55 years
- Elderly: 56 years and above

Expression Classification:
- Neutral: No obvious expression
- Happy: Corners of mouth raised, may show teeth
- Surprised: Eyebrows raised, mouth open
- Angry: Brows furrowed, lips pressed together
- Sad: Brows drooping, corners of mouth pulled down
- Fearful: Eyes wide open, mouth slightly open
- Disgusted: Nose wrinkled, upper lip raised

Strategy 2: Body Pose Labeling Standards

Keypoint Positioning Principles:

Joint Point Positioning:
- Shoulders: Center of the shoulder joint
- Elbows: Bend point of the elbow joint
- Wrists: Center of the wrist joint
- Hips: Center of the hip joint (where the thigh meets the pelvis, typically below the waistband)
- Knees: Center of the knee joint
- Ankles: Center of the ankle joint

Facial Point Positioning:
- Nose: Nose tip
- Eyes: Center of the eyeball
- Ears: Center of the ear

Occlusion and Visibility Labeling:

Visibility Levels:
- 2: Fully visible, can be precisely located
- 1: Occluded but position can be inferred
- 0: Not visible, position cannot be inferred

Occlusion Types:
- Self-occlusion: Occluded by other parts of one's own body
- Person occlusion: Occluded by other people
- Object occlusion: Occluded by objects in the scene
- Out of bounds: Beyond the image boundary

Multi-Person Scene Handling:

Person ID Assignment:
- Assign a unique ID to each person
- Maintain consistent IDs within the same video
- Assign new IDs to newly appearing persons

Overlap Handling:
- Label each person's complete skeleton separately
- Mark visibility for occluded keypoints
- Record occlusion relationships

Strategy 3: Behavior Labeling Standards

Action Boundary Definition:

Action Start:
- The first frame where the preparatory movement begins
- Example: Running starts from the foot leaving the ground

Action End:
- The last frame where the action is completed
- Example: Running ends when the final foot lands and the person comes to rest

Transition Handling:
- Transition frames sit between two actions
- They may be assigned to either the previous or the next action
- Pick one convention per project and apply it consistently

Compound Action Handling:

Simultaneous Actions:
- Example: Walking while talking on the phone
- Label the primary action (walking)
- Additionally label the secondary action (phone call)

Sequential Actions:
- Example: Walking → Running → Stopping
- Label each action segment separately
- Ensure temporal continuity without overlap
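
The "temporal continuity without overlap" rule for sequential actions is easy to check mechanically. The sketch below assumes segments use the inclusive `start_frame` and `end_frame` fields from the annotation format shown earlier, so a segment ending at frame 150 is directly followed by one starting at frame 151. The function name is illustrative.

```python
def check_segments(segments):
    """Return a list of problems found in one person's action segments.

    Frame ranges are treated as inclusive on both ends.
    """
    problems = []
    ordered = sorted(segments, key=lambda s: s["start_frame"])
    for seg in ordered:
        if seg["end_frame"] < seg["start_frame"]:
            problems.append(f"{seg['action']}: end before start")
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur["start_frame"] - prev["end_frame"]
        if gap <= 0:
            problems.append(
                f"overlap between {prev['action']} and {cur['action']}"
            )
        elif gap > 1:
            problems.append(
                f"gap of {gap - 1} frames between "
                f"{prev['action']} and {cur['action']}"
            )
    return problems
```

Simultaneous actions (primary plus secondary) would need to be checked per track, since overlap between tracks is expected there.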

Abnormal Behavior Labeling:

Abnormal Behavior Types:
- Falling: Person suddenly falls from standing/walking position
- Fighting: Physical conflict between two or more people
- Climbing over: Crossing fences, barriers, or other obstacles
- Trespassing: Entering restricted areas
- Loitering: Staying in the same area for extended periods or pacing back and forth

Labeling Elements:
- Behavior type
- Involved person IDs
- Start and end times
- Location (area annotation)
- Severity level (minor/moderate/severe)
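
The labeling elements above map naturally onto a small record type. The following sketch shows one possible shape; the class name, field names, and the snake_case behavior identifiers are assumptions for illustration, and the validation simply rejects values outside the lists given in this section.

```python
from dataclasses import dataclass

BEHAVIORS = {"falling", "fighting", "climbing_over", "trespassing", "loitering"}
SEVERITIES = {"minor", "moderate", "severe"}


@dataclass
class AbnormalEvent:
    behavior: str
    person_ids: list   # tracking IDs of everyone involved
    start_frame: int
    end_frame: int     # inclusive
    area: str          # name of the annotated region (hypothetical field)
    severity: str

    def __post_init__(self):
        if self.behavior not in BEHAVIORS:
            raise ValueError(f"unknown behavior: {self.behavior}")
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        if not self.person_ids:
            raise ValueError("at least one person ID is required")
        if self.end_frame < self.start_frame:
            raise ValueError("end_frame is before start_frame")
```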

Strategy 4: Quality Control

Face Labeling Quality Checks:

Checklist:
□ Does the bounding box fully contain the face?
□ Is the bounding box too large (containing excessive background)?
□ Are keypoint positions accurate?
□ Are occlusion markers correct?
□ Are attribute labels reasonable?

Quality Metrics:
- Bounding box IoU > 0.9
- Keypoint error < 3 pixels
- Attribute accuracy > 95%

Pose Labeling Quality Checks:

Checklist:
□ Are keypoints at correct anatomical positions?
□ Are skeletal connections reasonable (no crossing, no abnormal lengths)?
□ Are visibility labels correct?
□ Are multi-person scene IDs correctly assigned?

Quality Metrics:
- Keypoint error < 5 pixels
- Reasonable skeletal length ratios
- ID consistency > 99%
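
For the keypoint error metric, a simple approach is to average the pixel distance between an annotator's keypoints and a reference annotation, counting only keypoints that both sides actually labeled. This is a sketch under that assumption; the function name is illustrative.

```python
import math


def mean_keypoint_error(labeled, reference):
    """Mean pixel distance over keypoints labeled (v > 0) in both inputs.

    Both arguments are lists of (x, y, v) tuples in the same keypoint order.
    Returns None when no keypoint is labeled on both sides.
    """
    dists = [
        math.hypot(lx - rx, ly - ry)
        for (lx, ly, lv), (rx, ry, rv) in zip(labeled, reference)
        if lv > 0 and rv > 0
    ]
    return sum(dists) / len(dists) if dists else None
```

A fixed pixel threshold is resolution dependent, so some teams may prefer to normalize the error by the person's bounding box size before comparing it with the limit.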

Behavior Labeling Quality Checks:

Checklist:
□ Is the action classification correct?
□ Are temporal boundaries accurate?
□ Are there any missed actions?
□ Are multi-person actions correctly associated?

Quality Metrics:
- Classification accuracy > 95%
- Temporal boundary error < 0.5 seconds
- Miss rate < 3%
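
The temporal metrics can be checked with a few lines of arithmetic. The sketch below compares one labeled segment against a reference segment and converts the frame difference to seconds; `fps=30` and the function names are assumptions for illustration.

```python
def boundary_error_seconds(labeled, reference, fps=30):
    """Largest start or end disagreement between two segments, in seconds."""
    start_err = abs(labeled["start_frame"] - reference["start_frame"])
    end_err = abs(labeled["end_frame"] - reference["end_frame"])
    return max(start_err, end_err) / fps


def miss_rate(num_reference, num_matched):
    """Fraction of reference actions that have no matching label."""
    if num_reference == 0:
        return 0.0
    return (num_reference - num_matched) / num_reference
```

At 30 frames per second, the 0.5 second limit corresponds to a boundary that is off by at most 15 frames.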

📊 Real-World Case Studies

Case 1: Smart Campus Facial Access Control System

Project Background: A technology campus needed a facial recognition access control system that could handle rapid passage for 10,000+ employees, with recognition accuracy above 99.5% and a passage time under 1 second per person.

Data Requirements:

  • Collect multi-angle, multi-lighting facial images for each person
  • Handle occlusion scenarios such as wearing masks and glasses
  • Distinguish between real persons and photo/video spoofing attacks

Labeling Plan:

Phase 1: Basic Face Data (2 weeks)

Collection specifications:

  • 20 photos per person
  • Angles: Front, left 15°, right 15°, downward 15°, upward 15°
  • Lighting: Normal, bright, backlit, side-lit
  • Expressions: Neutral, smiling

Labeling content:

  • Face bounding boxes
  • 5-point landmarks
  • Person IDs
  • Collection condition tags

Phase 2: Occlusion Data (1 week)

Collection specifications:

  • Wearing masks (surgical masks, N95 masks)
  • Wearing glasses (regular glasses, sunglasses)
  • Wearing hats (baseball caps, beanies)
  • Combined occlusions

Labeling content:

  • Face bounding boxes (including occluding objects)
  • Visible keypoints
  • Occlusion type and degree
  • Person IDs

Phase 3: Liveness Detection Data (1 week)

Collection specifications:

  • Real person videos (blinking, head turning, mouth opening)
  • Attack samples (photos, videos, 3D masks)

Labeling content:

  • Real/attack labels
  • Attack type
  • Action sequence annotation

Advantages of Using TjMakeBot:

  • AI automatically detects face positions, so annotators only need to fine-tune the results
  • Batch import of personnel information with automatic ID association
  • Supports frame-by-frame video labeling with consistent ID tracking

Project Results:

  • Labeled data volume: 200,000+ images
  • Labeling accuracy: 99.2%
  • Model recognition accuracy: 99.7%
  • Liveness detection accuracy: 99.5%

Case 2: Shopping Mall Customer Behavior Analysis

Project Background: A chain of shopping malls wanted to use AI to analyze customer behavior, optimizing store layouts and marketing strategies. The system needed to identify customer walking paths, dwell times, and interaction behaviors.

Labeling Tasks:

Task 1: Person Detection and Tracking

  • Detect all customers in the frame
  • Track the same customer across cameras
  • Record walking trajectories

Task 2: Pose Estimation

  • Label 17 body keypoints
  • Used to analyze customer postures (standing, bending, squatting, etc.)

Task 3: Behavior Recognition

  • Browsing: Stopping in front of shelves to look
  • Picking up: Taking products from shelves
  • Putting back: Returning products to shelves
  • Trying: Trying on/testing products
  • Talking: Conversing with staff or companions
  • Checkout: Paying at the register

Labeling Workflow:

Step 1: Video Preprocessing
- Split surveillance video by hour
- Filter valid segments (with customer activity)
- Standardize video format and resolution

Step 2: Person Detection Labeling
- Use AI pre-labeling for body bounding boxes
- Manual review and correction
- Assign tracking IDs

Step 3: Pose Labeling
- Perform pose labeling on keyframes
- Use interpolation algorithms to generate intermediate frames
- Manual inspection of anomalous frames

Step 4: Behavior Labeling
- Label each customer's behavior sequence
- Record behavior start and end times
- Label the area where behavior occurs

Step 5: Quality Review
- Cross-validate labeling consistency
- Expert review of anomalous samples
- Generate quality reports
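
Step 3's keyframe interpolation can be as simple as linear interpolation between two manually labeled keyframes. The article does not say which interpolation algorithm was used, so the following is a minimal sketch of the linear case, with a hypothetical function name. It works well for short, smooth motions and is exactly why the manual inspection of anomalous frames remains necessary.

```python
def interpolate_keypoints(frame_a, kps_a, frame_b, kps_b):
    """Linearly interpolate keypoints for frames strictly between two keyframes.

    kps_a and kps_b are equal-length lists of (x, y) tuples.
    Returns {frame_index: [(x, y), ...]} for the intermediate frames.
    """
    if frame_b <= frame_a:
        raise ValueError("frame_b must come after frame_a")
    span = frame_b - frame_a
    result = {}
    for frame in range(frame_a + 1, frame_b):
        t = (frame - frame_a) / span  # 0 at keyframe A, 1 at keyframe B
        result[frame] = [
            (xa + t * (xb - xa), ya + t * (yb - ya))
            for (xa, ya), (xb, yb) in zip(kps_a, kps_b)
        ]
    return result
```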

Project Results:

  • Labeled video duration: 500+ hours
  • Labeled person count: 100,000+
  • Behavior annotations: 50,000+
  • Behavior recognition accuracy: 91.3%

Business Value:

  • Identified popular and underperforming areas
  • Optimized product placement
  • Identified high-value customer behavior patterns
  • Increased conversion rate by 15%

Case 3: Campus Safety Abnormal Behavior Detection

Project Background: A city's education bureau deployed AI safety monitoring systems across all primary and secondary schools, requiring real-time detection of abnormal behaviors on campus, including fighting, falling, climbing over walls, etc.

Core Challenges:

  • Extremely scarce abnormal behavior samples (normal:abnormal > 1000:1)
  • Fast response required (detection latency <3 seconds)
  • Very low false positive rate required (to avoid frequent false alarms)

Data Strategy:

1. Normal Behavior Data

  • Source: Daily surveillance footage
  • Scale: 10,000+ hours
  • Labeling: Sampled labeling, 10 minutes extracted per hour of footage (about 1,670 hours at the 10,000-hour scale)

2. Abnormal Behavior Data

  • Source: Historical incident footage + simulated drills
  • Scale: 500+ hours
  • Labeling: Full detailed labeling

3. Data Augmentation

  • Augment abnormal behavior videos
  • Time stretching/compression
  • Mirror flipping
  • Brightness/contrast adjustment
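
Mirror flipping needs care when the clip already carries pose labels: the x coordinates must be mirrored and the left and right keypoints must swap places, otherwise a flipped person ends up with a "left shoulder" on their right side. The sketch below assumes the COCO 17-point order listed earlier and 0-based pixel coordinates; the function name is illustrative.

```python
# Index pairs (left, right) in the COCO 17-point order.
FLIP_PAIRS = [(1, 2), (3, 4), (5, 6), (7, 8),
              (9, 10), (11, 12), (13, 14), (15, 16)]


def hflip_keypoints(keypoints, image_width):
    """Mirror 17 (x, y, v) keypoints horizontally and swap left/right labels."""
    flipped = [
        (image_width - 1 - x, y, v) if v > 0 else (x, y, v)
        for x, y, v in keypoints
    ]
    for left, right in FLIP_PAIRS:
        flipped[left], flipped[right] = flipped[right], flipped[left]
    return flipped
```

Direction-sensitive labels, such as the direction of a climbing-over event, would need the same kind of adjustment when a clip is mirrored.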

Abnormal Behavior Labeling Standards:

Fighting:
- Definition: Physical conflict between two or more people
- Features: Pushing, punching, kicking, grappling
- Labeling: Involved persons, start/end times, severity level

Falling:
- Definition: Person suddenly falls from a standing or walking position
- Features: Loss of balance, rapid descent, remaining on the ground
- Labeling: Fallen person, fall time, whether they got up on their own

Climbing Over:
- Definition: Crossing walls, fences, or other barriers
- Features: Climbing, straddling, jumping
- Labeling: Person climbing, location, direction

Gathering:
- Definition: Multiple people abnormally gathering in the same area
- Features: More than 5 people, duration >3 minutes
- Labeling: Gathering area, number of people, duration
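
The gathering definition is the most rule-like of the four and can be pre-screened automatically before an annotator confirms it. The sketch below assumes a per-frame count of people inside the monitored area is already available from person detection; the function name, the `fps` default, and the input format are assumptions for illustration.

```python
def find_gatherings(counts, fps=30, min_people=6, min_seconds=180):
    """Find runs where more than 5 people stay for more than 3 minutes.

    counts[i] is the number of people in the area at frame i.
    Returns a list of (start_frame, end_frame) pairs, inclusive.
    """
    min_frames = int(min_seconds * fps)
    events = []
    run_start = None
    for i, n in enumerate(counts):
        if n >= min_people:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start > min_frames:
                events.append((run_start, i - 1))
            run_start = None
    # Close a run that lasts until the end of the clip.
    if run_start is not None and len(counts) - run_start > min_frames:
        events.append((run_start, len(counts) - 1))
    return events
```

Candidates found this way are only pre-labels; whether a gathering is actually abnormal still depends on context that an annotator has to judge.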

Project Results:

  • Schools covered: 200+
  • Labeled video: 2,000+ hours
  • Abnormal detection accuracy: 94.5%
  • False positive rate: <2%
  • Average response time: 1.8 seconds

🛠️ TjMakeBot Security Labeling Features

Face Labeling Tools

Automatic Face Detection:

  • AI automatically locates all faces in an image
  • Supports simultaneous multi-face detection
  • Automatically generates bounding boxes

Landmark Labeling:

  • Supports 5-point, 68-point, and 106-point schemes
  • Smart snapping feature for improved labeling precision
  • Batch copy landmark templates

Attribute Labeling:

  • Preset attribute options for quick selection
  • Supports custom attributes
  • Batch attribute modification

Body Pose Tools

Skeleton Labeling:

  • Visualized skeletal connections
  • Drag-and-drop keypoint adjustment
  • Automatic detection of abnormal poses

Video Tracking:

  • Automatic tracking of the same person
  • ID management and switching
  • Trajectory visualization

Behavior Labeling Tools

Timeline Labeling:

  • Visual timeline
  • Drag to adjust temporal boundaries
  • Multi-track parallel labeling

Action Templates:

  • Preset common action types
  • Keyboard shortcuts for quick labeling
  • Supports custom actions

Privacy Protection

Data Anonymization:

  • Automatic face blurring
  • Sensitive area masking
  • Metadata cleaning

Access Control:

  • Tiered permission management
  • Operation log recording
  • Encrypted data storage

⚖️ Privacy and Compliance Considerations

Data Collection Compliance

Informed Consent:

  • Post clear notices in data collection areas
  • Obtain explicit consent from data subjects
  • Provide opt-out mechanisms

Minimum Necessity Principle:

  • Collect only necessary data
  • Limit data retention periods
  • Regularly delete expired data

Data Processing Compliance

Data Anonymization:

  • Decouple training data from personal identities
  • Use anonymous IDs instead of real identities
  • Generalize sensitive attributes (for example, store an age group rather than an exact age)
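
One common way to replace real identities with anonymous IDs is a keyed hash: the same person always maps to the same ID, but the mapping cannot be reproduced without the secret key. The sketch below uses HMAC-SHA256 from the Python standard library; the function name, the `anon_` prefix, and the 16-character truncation are illustrative choices. Strictly speaking this is pseudonymization, since anyone holding the key can recompute the mapping, so the key needs the same protection as the identity data itself.

```python
import hashlib
import hmac


def anonymous_id(real_id, secret_key):
    """Derive a stable pseudonymous ID from a real identifier.

    secret_key is bytes and must be stored separately from the dataset.
    """
    digest = hmac.new(
        secret_key, real_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return "anon_" + digest[:16]
```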

Access Control:

  • Strict permission management
  • Traceable operation records
  • Regular security audits

Model Deployment Compliance

Usage Restrictions:

  • Clearly define the scope of AI system usage
  • Prohibit unauthorized uses
  • Establish abuse reporting mechanisms

Transparency:

  • Disclose AI system capabilities and limitations
  • Provide manual review channels
  • Cooperate with inspections by regulatory authorities

💬 Conclusion

Security surveillance AI is a field where technology and ethics carry equal weight. High-quality data labeling is the foundation for building reliable AI systems, while compliant data processing is the prerequisite for earning public trust.

Key Takeaways:

  1. Face labeling: Precise bounding boxes, accurate keypoints, reasonable attribute classification
  2. Pose labeling: Standardized keypoint definitions, correct visibility labeling, consistent ID management
  3. Behavior labeling: Clear action definitions, accurate temporal boundaries, complete event records
  4. Quality control: Multi-level review, cross-validation, continuous improvement
  5. Privacy compliance: Informed consent, data anonymization, access control

TjMakeBot provides professional tool support for security AI labeling, from face detection to behavior recognition and from single-frame labeling to video tracking, helping you build security datasets efficiently while keeping data processing compliant.

Let AI safeguard security, starting with responsible data labeling!




Keywords: security AI, face recognition, behavior recognition, pose estimation, surveillance analysis, intelligent security, TjMakeBot