
Smart Home AI: Hands-On Object Recognition Labeling for Home Scenarios

TjMakeBot Team · Industry Applications · 15 min read

🏠 Introduction: AI Enters Every Household

Smart homes are no longer a scene from science fiction movies — they have become a real part of our daily lives. According to IDC's China Smart Home Device Market Quarterly Tracker, the Chinese smart home device market is expected to ship 260 million units in 2025, with a market size exceeding 180 billion RMB. This figure represents a 15.8% increase over 2024, demonstrating the industry's strong growth momentum.

When we come home each day, smart door locks open for us via facial recognition; in the evening, smart lighting systems automatically adjust brightness based on our habits; in the morning, smart speakers play customized news briefings — behind these seemingly simple interactions lies a key technology: object recognition.

Object recognition technology has permeated every aspect of smart home applications:

Robot Vacuums: Need to precisely identify fixed obstacles like sofa legs, table legs, carpet edges, and charging docks, while also avoiding temporarily placed items like shoes, toys, and cables. In several products I personally tested, robots with advanced object recognition could accurately distinguish between "paper scraps to be swept" and "socks that shouldn't be vacuumed," greatly improving the user experience.

Smart Cameras: Modern home security systems no longer just record video — they can analyze frame content in real time. When a camera recognizes a family member arriving home, a delivery person at the door, or a stranger lingering outside, it sends corresponding notifications. More advanced applications include detecting whether pets are causing chaos at home or whether elderly family members are at risk of falling.

Smart Refrigerators: Through built-in cameras and object recognition algorithms, smart refrigerators can automatically record the types and quantities of stored ingredients, remind users which foods are about to expire, and provide recipe suggestions based on available ingredients. Some high-end models can even assess fruit ripeness and suggest optimal consumption times.

Smart Wardrobes: Not only can they identify different types of clothing (shirts, jackets, pants, etc.), but they can also analyze color coordination and recommend suitable outfit combinations based on weather forecasts and user schedules. For busy professionals who travel frequently, this feature greatly simplifies wardrobe decisions.

However, object recognition in home scenarios faces unique technical challenges:

Vast and Rapidly Changing Item Variety: A typical household may contain hundreds of different items, and with the popularity of online shopping, new items constantly enter the home environment. Today's shopping bags, newly purchased decorations, and gifts from friends may all be objects the AI model has never seen before.

Highly Irregular Placement: Unlike factory production lines or store shelves, items in home environments have no fixed placement patterns. Glasses might be on the nightstand, the sofa, the kitchen counter, or even on top of the refrigerator. This randomness poses enormous recognition challenges for AI models.

Extremely Complex Lighting Conditions: From soft morning light to intense midday sun to warm evening lamplight, lighting variations in home environments are more complex than in any other scenario. Additionally, different lamp brands and wall materials with varying reflective properties significantly affect object recognition.

Strict Privacy Protection Requirements: The home is the most private space, and any technology application involving home environments must prioritize user privacy. This applies not only to data transmission and storage security but also requires appropriate handling of sensitive information during the labeling process.

This article takes a deep dive into object recognition labeling methods for smart home scenarios, combining real project experience to provide you with a complete, practical labeling guide for building high-quality home AI datasets.

🎯 Unique Characteristics of Home Scenarios

1. Item Category System

When building smart home object recognition models, establishing a scientifically sound category system is the key to success. Through multiple project implementations, we have found that the following classification approach aligns with human cognitive habits and effectively improves model performance.

Detailed Home Item Classification:

Furniture: This is the most complex category in home environments, not only because of the wide variety but also due to the enormous differences in shape, material, and color.

  • Seating: Includes sofas, armchairs, dining chairs, stools, recliners, bar stools, etc. Note that even the same type of chair can appear completely different under different lighting and angles. For example, a black leather chair may appear very bright under strong light but become almost a silhouette when backlit.
  • Tables: Dining tables, coffee tables, side tables, desks, vanity tables, nightstands, etc. Labeling these items requires special attention to the influence of items placed on the surface. When a table has other objects on it, the table's outline may become blurred, requiring accurate labeling of the table's own boundaries.
  • Storage: Wardrobes, bookshelves, shoe cabinets, storage cabinets, drawer units, etc. These furniture pieces are typically large, and their contents are not visible. When labeling, distinguish between the furniture itself and the items inside.
  • Bedding: Beds, mattresses, pillows, headboards, etc. Bedding arrangements vary widely and change seasonally, requiring consideration of different seasonal bedding configurations.

Appliances: These items are characterized by clear boundaries and defined functions, but significant differences exist between brands and models.

  • Major Appliances: Refrigerators, washing machines, dryers, TVs, indoor AC units, etc. These devices are usually in fixed positions, but environmental changes around them (such as food near the refrigerator or cleaning supplies on the washing machine) can affect recognition.
  • Small Appliances: Microwaves, ovens, coffee makers, rice cookers, juicers, air purifiers, humidifiers, etc. These devices are typically placed on countertops and easily form composite scenes with other items.
  • Personal Care Appliances: Hair dryers, electric toothbrushes, razors, curling irons, etc. These items are usually small but vary greatly in shape, requiring high-precision labeling.

Daily Items: This is the most challenging category due to the extreme diversity of items with varying shapes, materials, and purposes.

  • Tableware: Bowls, plates, cups, chopsticks, forks, knives, etc. These items have similar shapes but different purposes and often appear in combinations (such as a complete table setting).
  • Cleaning Supplies: Brooms, mops, trash cans, cleaning agents, rags, wash basins, etc. These items are typically distributed across different rooms with highly variable usage frequency and placement.
  • Personal Items: Toothbrushes, towels, shampoo, skincare products, cosmetics, etc. These items are usually small, easily confused, and placed very differently across households.

2. Scene Characteristics

Room Types and Their Features:

Living Room: As the center of family activity, living rooms are typically spacious with many items and frequent human activity.

  • Main Items: Sofas, coffee tables, TV stands, TVs, audio equipment, decorations, curtains, etc.
  • Characteristics: Large lighting variations (natural light during the day, artificial lighting at night), relatively fixed item placement with occasional temporary changes (such as visitors' bags, children's toys, etc.).
  • Labeling Focus: Pay special attention to labeling temporary items and the layered relationships of items placed on furniture.

Bedroom: Highly private, with large lighting variation ranges and relatively fixed items.

  • Main Items: Beds, nightstands, wardrobes, vanity tables, table lamps, curtains, etc.
  • Characteristics: Extreme lighting differences between morning and evening, potentially only dim light sources at night, with possible human activity.
  • Labeling Focus: Consider the impact of lighting changes at different times on object appearance, especially changes in bedding.

Kitchen: Dense items, diverse materials, complex environment.

  • Main Items: Cabinets, refrigerators, stoves, range hoods, sinks, various kitchen utensils, spice bottles, etc.
  • Characteristics: Oil smoke, steam, and humidity can affect image quality; items are densely placed and frequently rearranged.
  • Labeling Focus: Pay special attention to precise labeling of safety-related items (such as knives) and careful handling of fragile items.

Bathroom: Relatively small space, high humidity, many reflective surfaces.

  • Main Items: Toilets, sinks, mirrors, shower facilities, towel racks, toiletries, etc.
  • Characteristics: Mirror reflections, water stains, and wet surfaces pose special requirements for recognition algorithms.
  • Labeling Focus: Handle the impact of reflections and water stains on object recognition, as well as item safety in wet environments.

Study/Home Office: Neatly arranged items, strong functionality, high lighting requirements.

  • Main Items: Desks, bookshelves, computers, printers, folders, stationery, etc.
  • Characteristics: Relatively organized item placement, but numerous small stationery types that are easily confused.
  • Labeling Focus: Accurately distinguish items with similar functions but different purposes (such as different types of pens, rulers, etc.).

Lighting Condition Complexity:

Natural Light Environment: Natural light variation in home environments is one of the biggest challenges.

  • Morning Light: Warm color temperature (~3000K), strong directional light, typically entering through east-facing windows, creating distinct shadows.
  • Midday Light: Neutral color temperature (~5000-6500K), highest light intensity, may produce strong contrast.
  • Dusk Light: Color temperature shifts warm again, soft light but still directional.
  • Overcast: Cool color temperature, even but overall dim lighting, lacking depth.

Artificial Light Environment: Modern home lighting systems are increasingly complex, requiring consideration of multiple light source overlay effects.

  • Main Lighting: Usually ceiling or pendant lights providing base illumination, but may create shadows in certain areas.
  • Supplementary Lighting: Table lamps, floor lamps, wall sconces, etc., for localized lighting that may produce additional light and shadow effects.
  • Task Lighting: Kitchen counter lights, vanity mirror lights, bedside lamps, etc., targeting specific functional areas.

Mixed Light Environment: The blend of indoor and outdoor light during daytime is the norm in home environments and the situation requiring the most attention during labeling.

  • Near Windows: Natural and artificial light mix, potentially producing complex light and shadow effects.
  • Backlighting: When the main light source is behind, foreground items may become silhouettes.
  • Color Temperature Mixing: Overlapping light sources with different color temperatures can produce unexpected color effects.

3. Labeling Challenges and Solutions

Item Occlusion Issues: This is the most common challenge in home environments, requiring detailed handling rules.

Tiered Occlusion Labeling:

  • Fully Visible (0-10% occlusion): Object is completely visible with clear boundaries; label normally.
  • Slight Occlusion (10-25% occlusion): Object body is visible with minor parts occluded; category can still be accurately determined; label by actual visible boundaries.
  • Moderate Occlusion (25-50% occlusion): Object is partially occluded but category can still be determined; label the visible part and record occlusion status.
  • Severe Occlusion (50-80% occlusion): Most of the object is occluded; category can be inferred from visible parts only; label cautiously and add confidence markers.
  • Extreme Occlusion (>80% occlusion): Object is almost completely occluded and cannot be accurately identified; generally not labeled.

Common Occlusion Scenario Handling:

  • Furniture-to-furniture occlusion: e.g., a chair partially occluded by a table — label the visible part of the chair and record the occlusion relationship.
  • Item stacking: e.g., stacked books or piled clothing — label each visible layer of items.
  • Human body occlusion: Family members or pets occluding items — label the visible parts of occluded items.
  • Transparent object occlusion: Glass containers, transparent storage boxes, etc. — label items inside the container.

Scale Variation Handling: The same object appears vastly different at different distances.

Distance-Layered Labeling:

  • Close range (1-2m): Object occupies most of the image with clearly visible details; fine features can be identified.
  • Medium range (2-4m): Object is clearly distinguishable with main features visible; suitable for standard recognition tasks.
  • Far range (4-8m): Object is smaller but category can still be identified; mainly used for detection rather than recognition.
  • Very far range (>8m): Object is very small; only rough category determination possible; may require special handling.
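The distance tiers above are easy to encode as a small helper that annotation tooling can call when tagging each object. This is a minimal sketch: the tier names and the rule that boundary values fall into the nearer tier are assumptions, not part of a fixed standard.

```python
def distance_tier(distance_m):
    """Map camera-to-object distance (meters) to a labeling tier.

    Thresholds follow the distance-layered scheme above; values on a
    boundary are assigned to the nearer tier (an assumption).
    """
    if distance_m < 1:
        raise ValueError("distances under 1 m are outside the defined range")
    if distance_m <= 2:
        return "close"      # details clearly visible
    if distance_m <= 4:
        return "medium"     # main features visible, standard recognition
    if distance_m <= 8:
        return "far"        # detection rather than recognition
    return "very_far"       # rough category only, may need special handling


print(distance_tier(1.5))  # close
print(distance_tier(6.0))  # far
```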

Viewing Angle Diversity Challenge: AI devices in homes may observe the same scene from different angles.

Viewing Angle Classification Labeling:

  • Front view: Object faces the camera with the most complete features; easiest to recognize.
  • Side view: Object's side faces the camera; may lack frontal features; requires specialized training.
  • Top-down view: Common for robot vacuums and similar devices; shows the object's top; requires dedicated training data.
  • Bottom-up view: Less common but possible with certain camera mounting positions.
  • Oblique view: Between front and side views; the most common non-standard viewing angle.

Practical Labeling Experience Summary: Through analysis of thousands of home scene images, we found that objects at certain angles are particularly prone to misidentification. For example, a chair viewed from the side may be mistaken for other furniture, and a cup viewed from above may be confused with other circular objects. Therefore, special attention to these edge cases is needed during labeling.

💡 Labeling Strategies and Methods

Strategy 1: Establishing a Standardized Category System

In smart home object recognition projects, establishing a scientific, practical category system is the foundation for success. Through multiple project implementations, we have summarized an effective set of category definition principles.

Core Principles for Category Definition:

1. Function-Oriented Principle: Classify items by their primary function, which aligns better with human cognitive habits and helps model learning. For example, items with seating functionality — whether a living room sofa, dining chair, or office chair — can all be grouped under "Seating." This classification approach enables the model to better understand the essential purpose of items.

2. Granularity Balance Principle: Overly fine-grained categories lead to dramatically increased labeling workload, while overly coarse categories affect recognition precision. In practice, we found the need to strike a balance between efficiency and accuracy. For instance, we don't further subdivide "chair" into "office chair," "dining chair," and "folding chair" because their basic forms and functions are similar. However, for appliances, finer distinctions are needed — "microwave" and "oven" are both kitchen appliances but differ significantly in appearance and function.

3. Mutually Exclusive and Collectively Exhaustive Principle: Each item should clearly belong to one and only one category, while ensuring all possible items have a corresponding classification. This requires thorough consideration of edge cases when designing the category system, with clear assignment rules for items that might span multiple categories.

Best Practices for Category Hierarchy Design:

In our projects, we recommend a three-level classification structure that maintains sufficient detail while remaining manageable:

Level 1 (Major Categories): Typically 8-12 major categories, including furniture, appliances, daily items, decorations, kitchenware, bathroom items, personal care items, office supplies, etc. These major categories essentially cover all item types that may appear in home environments.

Level 2 (Sub-Categories): 3-8 sub-categories under each major category. For example, under "Furniture," sub-categories include "Seating," "Tables," "Storage," "Bedding," etc. Sub-category divisions are primarily based on usage scenarios and basic forms.

Level 3 (Specific Categories): Specific categories under sub-categories, such as "Sofa," "Chair," "Stool" under "Seating." Specific categories should be detailed enough for the model to accurately identify.
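A three-level system like this can be stored as a nested mapping, with a lookup that resolves a specific label back to its full path. The fragment below is illustrative only: real projects use 8-12 major categories, and the category names here are assumptions drawn from the examples above, not the full 112-class system.

```python
# Illustrative three-level taxonomy fragment (not the full system).
TAXONOMY = {
    "furniture": {
        "seating": ["sofa", "chair", "stool"],
        "tables": ["dining_table", "coffee_table", "desk"],
        "storage": ["wardrobe", "bookshelf", "shoe_cabinet"],
    },
    "appliances": {
        "major": ["refrigerator", "washing_machine", "tv"],
        "small": ["microwave", "oven", "coffee_maker"],
    },
}


def category_path(label):
    """Return (major, sub, specific) for a specific-category label,
    or None if the label is not in the taxonomy."""
    for major, subs in TAXONOMY.items():
        for sub, specifics in subs.items():
            if label in specifics:
                return (major, sub, label)
    return None


print(category_path("sofa"))  # ('furniture', 'seating', 'sofa')
```

A flat lookup like this also makes the "mutually exclusive" principle checkable: if any label appears under two paths, the taxonomy is ambiguous.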

Real-World Case: In an object recognition project we developed for a smart home manufacturer, we finalized a system with 112 categories. Early in the project, the client wanted to subdivide "cup" into "water cup," "tea cup," "coffee cup," "mug," etc. However, data analysis revealed that the appearance differences among these cups confused the recognition algorithm, actually reducing overall accuracy. We ultimately unified them into a single "cup" category and improved recognition by increasing training data diversity.

Strategy 2: Handling Occlusion and Truncation

In home environments, item occlusion and truncation are the most common and challenging labeling issues. We need clear labeling rules to ensure data consistency and model robustness.

Tiered Occlusion Labeling:

No Occlusion (0-10%): Object is fully visible with clear boundaries; label normally. These samples form the foundation for model training and should be sufficient in quantity and diversity.

Slight Occlusion (10-25%): Object body is visible with minor parts occluded; category can still be accurately determined. Label by actual visible boundaries but record occlusion status in notes. These samples help the model learn recognition under minor interference.

Moderate Occlusion (25-50%): Object is partially occluded but category can still be determined. Special attention is needed for bounding box drawing — it should include the visible part while reasonably estimating the occluded portion's position. We typically label two boxes: a large box containing the entire object (including occluded parts) and a small box containing only the visible portion.

Severe Occlusion (50-80%): Most of the object is occluded; category can only be inferred from visible parts. These samples require annotators with strong professional judgment; multi-person labeling and expert review may be necessary.

Extreme Occlusion (>80%): Object is almost completely occluded and cannot be accurately identified. Generally not labeled, but if the object is critical to the application scenario (such as hazardous items that robot vacuums need to identify), special handling is required.
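The tiered rules above can be captured as a single policy function that annotation tooling applies per object. This is a sketch under the thresholds stated above; the field names (`dual_box`, `review`) are assumptions introduced here to represent the two-box rule for moderate occlusion and the expert-review requirement for severe occlusion.

```python
def occlusion_policy(occlusion_ratio):
    """Map an occlusion ratio in [0, 1] to the tiered labeling rules above."""
    if not 0.0 <= occlusion_ratio <= 1.0:
        raise ValueError("occlusion_ratio must be in [0, 1]")
    if occlusion_ratio <= 0.10:
        tier, label, dual_box, review = "none", True, False, False
    elif occlusion_ratio <= 0.25:
        # label by visible boundaries, note occlusion status
        tier, label, dual_box, review = "slight", True, False, False
    elif occlusion_ratio <= 0.50:
        # visible-part box plus estimated full-extent box
        tier, label, dual_box, review = "moderate", True, True, False
    elif occlusion_ratio <= 0.80:
        # label cautiously; flag for multi-person labeling / expert review
        tier, label, dual_box, review = "severe", True, True, True
    else:
        # generally not labeled
        tier, label, dual_box, review = "extreme", False, False, False
    return {"tier": tier, "label": label, "dual_box": dual_box, "review": review}
```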

Precise Truncation Labeling Rules:

Slight Truncation (<20%): A small portion of the object extends beyond the image boundary; its full boundary can be inferred. Label the complete bounding box representing the object's actual extent.

Moderate Truncation (20-50%): A significant portion of the object is beyond the boundary, but key features are still visible. Label only the visible portion within the image while marking truncation status in metadata.

Severe Truncation (>50%): Most of the object is outside the image; difficult to determine the complete shape. Generally not labeled unless the visible portion has sufficient identifying features.

Truncation Handling Experience: In real projects, we found that truncated object handling needs to be tailored to the application scenario. For robot vacuum applications, even very small truncated objects (such as the bottom of a table leg) need labeling because it relates to navigation safety. For smart speaker visual interaction features, truncated faces are typically not labeled.
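The truncation rules, including the application-specific override described in the experience note, can be sketched like this. The `application` parameter and its `"robot_vacuum"` value are hypothetical names introduced here to express the safety-driven exception; the ratio thresholds follow the rules above.

```python
def should_label_truncated(truncation_ratio, application="general",
                           has_key_features=False):
    """Decide whether a truncated object gets a label.

    Robot-vacuum projects label even tiny truncated objects (navigation
    safety), per the experience note above.
    """
    if application == "robot_vacuum":
        return True
    if truncation_ratio < 0.20:
        return True                 # slight: label the full inferred box
    if truncation_ratio <= 0.50:
        return True                 # moderate: label visible part, mark metadata
    return has_key_features         # severe: only if identifying features remain
```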

Strategy 3: The Importance of Multi-View Labeling

AI devices in home environments may observe the same scene from different angles, making multi-view labeling crucial for improving model generalization.

View Classification and Labeling Standards:

Front View (0°±15°): Object directly faces the camera with clearly visible main features. This is the ideal labeling view and should comprise 30-40% of training data.

Side View (90°±30°): Observing from the object's side, showing side features. This view is very important for distinguishing similarly shaped objects, such as different types of chairs.

Rear View (180°±15°): Observing from behind the object. Although front features are not visible, the shape and texture of the back are equally important.

Top-Down View (-90°): Observing vertically from above, common for robot vacuums and similar devices. Objects present their top contours in this view, requiring dedicated training data.

Bottom-Up View (+90°): Observing from below, relatively rare but possible with certain camera mounting positions.

Oblique View (any angle): Any angle between the standard views above. This is the most common view in home environments and should comprise 40-50% of training data.
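The angle bands above translate directly into a classifier. One assumption made here: the article defines top-down as exactly -90°, so the ±30° tolerance bands for the vertical extremes are this sketch's choice, not the article's standard; everything outside a named band falls back to "oblique".

```python
def classify_view(horizontal_deg, vertical_deg):
    """Classify a viewing angle into the view classes above.

    horizontal_deg: 0 = object facing the camera, in [-180, 180].
    vertical_deg: -90 = straight down (top-down), +90 = straight up.
    Vertical extremes take priority over the horizontal bands.
    """
    if vertical_deg <= -60:        # assumed tolerance band around -90
        return "top_down"
    if vertical_deg >= 60:         # assumed tolerance band around +90
        return "bottom_up"
    h = abs(horizontal_deg)
    if h <= 15:                    # front: 0 deg +/- 15
        return "front"
    if 60 <= h <= 120:             # side: 90 deg +/- 30
        return "side"
    if h >= 165:                   # rear: 180 deg +/- 15
        return "rear"
    return "oblique"               # most common in home environments


print(classify_view(15, -5))  # front
```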

Structured Recording of View Information:

In labeling files, we recommend including detailed view information:

{
  "object_id": "obj_001",
  "category": "sofa",
  "bbox": [100, 200, 400, 350],
  "view_angle": {
    "horizontal": 15,  // Horizontal angle relative to front
    "vertical": -5,    // Vertical angle relative to horizontal plane
    "confidence": 0.8  // Confidence of angle estimation
  },
  "occlusion_ratio": 0.15,
  "truncation_ratio": 0.05
}
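Records in this schema are easy to sanity-check automatically before they enter the dataset. The validator below is a minimal sketch: it assumes the bbox is `[x1, y1, x2, y2]` and that the ratio fields live in [0, 1]; the specific checks are this sketch's choices, not a fixed standard.

```python
import json


def validate_annotation(record):
    """Return a list of problems found in one annotation record
    (schema follows the JSON example above)."""
    problems = []
    for key in ("object_id", "category", "bbox"):
        if key not in record:
            problems.append("missing field: " + key)
    bbox = record.get("bbox", [])
    if len(bbox) != 4 or not (bbox[0] < bbox[2] and bbox[1] < bbox[3]):
        problems.append("bbox must be [x1, y1, x2, y2] with x1 < x2, y1 < y2")
    for key in ("occlusion_ratio", "truncation_ratio"):
        value = record.get(key, 0.0)
        if not 0.0 <= value <= 1.0:
            problems.append(key + " out of range [0, 1]")
    return problems


record = json.loads(
    '{"object_id": "obj_001", "category": "sofa",'
    ' "bbox": [100, 200, 400, 350],'
    ' "occlusion_ratio": 0.15, "truncation_ratio": 0.05}'
)
print(validate_annotation(record))  # []
```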

Multi-View Data Collection Strategies: To obtain rich view data, we typically use the following methods:

  • Use fisheye lenses or multi-camera systems for omnidirectional data collection
  • Set up fixed collection points at different heights and angles
  • Simulate different device perspectives (such as the low-angle view of robot vacuums)
  • Combine manual labeling with synthetic data augmentation

Strategy 4: The Value of Scene Context Labeling

Object recognition in home environments cannot be separated from scene context, as an object's function and meaning often depend on its surroundings.

Scene-Level Labeling Elements:

Room Type Labeling: Each image needs room type annotation — living room, bedroom, kitchen, bathroom, study, etc. This helps the model understand reasonable item distributions.

Lighting Condition Labeling: Label the lighting conditions at the time of capture, including:

  • Natural light/artificial light/mixed light
  • Light intensity level (strong/medium/weak)
  • Color temperature range (warm/neutral/cool)
  • Primary light source direction

Time Information Labeling: Record capture time information, including:

  • Time of day (early morning/morning/afternoon/evening/late night)
  • Season (spring/summer/autumn/winter)
  • Whether it's a holiday
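Scene-level tags like these work best as a controlled vocabulary so annotators cannot invent inconsistent values. The field names and value sets below mirror the lists above but are this sketch's assumptions, not a published schema.

```python
# Controlled vocabulary for scene-level tags (illustrative field names).
SCENE_FIELDS = {
    "room_type": {"living_room", "bedroom", "kitchen", "bathroom", "study"},
    "light_source": {"natural", "artificial", "mixed"},
    "light_intensity": {"strong", "medium", "weak"},
    "color_temperature": {"warm", "neutral", "cool"},
    "time_of_day": {"early_morning", "morning", "afternoon",
                    "evening", "late_night"},
    "season": {"spring", "summer", "autumn", "winter"},
}


def validate_scene_tags(tags):
    """Flag any tag whose value falls outside the controlled vocabulary;
    unknown field names are passed through without complaint."""
    return [
        field + "=" + repr(value) + " not in allowed set"
        for field, value in tags.items()
        if field in SCENE_FIELDS and value not in SCENE_FIELDS[field]
    ]


print(validate_scene_tags({"room_type": "kitchen", "light_source": "mixed"}))  # []
```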

Spatial Relationship Labeling:

Support Relationship: The "on" relationship is the most common spatial relationship, such as "cup on the table" or "vase on the windowsill." Correctly labeling this relationship helps the model understand physical world constraints.

Proximity Relationship: "next to," "beside," "near," etc. describe relative positions between objects, such as "table lamp beside the nightstand."

Containment Relationship: "inside," "in," etc. describe container-content relationships, such as "clothes in the wardrobe" or "food in the refrigerator."

Functional Relationship: Describes functional connections between objects, such as "coffee maker on the kitchen counter" implying a kitchen scene, or "nightstand beside the bed" reflecting the coordinated relationship between furniture pieces.
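Spatial relationships like these are commonly stored as (subject, relation, object) triples, which keeps them queryable. The triple encoding and the object IDs below are assumptions for illustration; the relation names follow the four types above.

```python
# Spatial relationships as (subject, relation, object) triples.
relations = [
    ("cup_01", "on", "table_01"),             # support relationship
    ("lamp_01", "next_to", "nightstand_01"),  # proximity relationship
    ("clothes_01", "inside", "wardrobe_01"),  # containment relationship
]


def objects_on(surface_id, triples):
    """Everything the given surface supports, per the 'on' relation."""
    return [subj for subj, rel, obj in triples
            if rel == "on" and obj == surface_id]


print(objects_on("table_01", relations))  # ['cup_01']
```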

Real-World Application of Context Labeling: In a home surveillance system we developed for a smart security company, scene context labeling enabled the model to better understand anomalous situations. For example, when the system detects "person in the kitchen," this is likely normal cooking activity; but if it detects "person active in the bedroom late at night," it may warrant attention. This context-based understanding greatly increased the system's practical value.

Labeling Quality Control: Due to the subjective nature of context labeling, we established a multi-layered quality control mechanism:

  • Developed detailed labeling specifications and examples
  • Implemented multi-person labeling and cross-validation
  • Established expert review processes
  • Conducted regular labeling consistency assessments

📊 Real-World Case Studies

Case 1: Robot Vacuum Obstacle Avoidance

Project Background: A well-known smart home company commissioned us to develop a visual obstacle avoidance system for their next-generation robot vacuum. Traditional infrared and collision sensor-based avoidance solutions had numerous limitations, such as inability to recognize transparent obstacles and unnecessary collisions. The client wanted to use visual recognition technology to enable the robot to intelligently identify and avoid various floor obstacles, including furniture, cables, shoes, and pets, while also recognizing the charging dock and restricted zones.

Project Challenge Analysis:

  • Unique perspective: The robot's camera is only 12cm above the ground, providing an extremely low-angle view completely different from standard object recognition perspectives
  • Complex lighting: The robot moves throughout the room, encountering various lighting conditions including bright window areas, dim under-bed spaces, and strong reflections on tile floors
  • Diverse obstacles: Need to identify obstacles ranging from thin cables to large chairs
  • Real-time requirements: The system must complete recognition and decision-making within tens of milliseconds to avoid collisions

Refined Recognition Target Design: After in-depth discussions with the client and technical evaluation, we refined obstacle categories into the following 19 classes:

Furniture:

  • Chair Leg: Various materials and shapes of chair supports
  • Table Leg: Dining tables, coffee tables, side tables, and other table supports
  • Bed Leg: Bed supports, typically thicker
  • Cabinet Leg: Wardrobe, bookshelf, and other furniture supports

Cables:

  • Power Cord: Charging cables, appliance power cords, etc.
  • Data Cable: USB cables, network cables, etc.
  • String/Rope: Shoelaces, decorative cords, etc.

Wearables:

  • Shoes: Sneakers, slippers, high heels, etc.
  • Clothing: Dropped clothes, towels, etc.

Pets and People:

  • Pet: Cats, dogs, and other household pets
  • Human Feet/Legs: Family members' feet and legs

Miscellaneous:

  • Toys: Children's toys, game controllers, etc.
  • Books: Flat or upright books
  • Cups: Cups of various materials
  • Paper: Newspapers, magazines, documents, etc.

Functional Items:

  • Charging Dock: Robot charging station
  • Threshold: Transition strips between different floor materials
  • Carpet Edge: Border between carpet and hard flooring
  • Hazardous: Sharp objects, liquids, and other items requiring special attention

Data Collection and Processing Strategy: We adopted a diversified data collection strategy:

Real Environment Collection: Data collected in 200+ real homes of different layouts, covering various lighting conditions, floor materials (hardwood, tile, carpet), and furniture styles.

Simulated Environment Collection: Various extreme situations simulated in laboratory settings, such as strong light reflections, shadow occlusion, and complex textured floors.

Dynamic Scene Collection: Simulated daily household activities, such as scattered toys from children playing, pet activity, and housework.

Labeling Standards and Quality Control: We developed strict labeling standards for the robot vacuum's special requirements:

Bounding Box Precision: For elongated objects like cables, each small segment must be precisely labeled to avoid omissions; for large objects like furniture legs, bounding boxes must tightly fit the object edges, avoiding being too large or too small.

Distance Perception Labeling: Additional distance information was added to labeling files to help the model understand actual object sizes and distances.

Occlusion Handling: Detailed occlusion labeling rules were developed, especially for semi-transparent objects (such as glass cups) and reflective objects (such as metal items).

Labeling Team Training: A professional labeling team of 20 people was assembled and underwent one week of specialized training to ensure consistent identification standards across all obstacle types.

Project Implementation Process:

Phase 1 (Pre-labeling): Used a general detection model for pre-labeling, achieving ~60% accuracy and significantly reducing manual labeling workload.

Phase 2 (Manual Refinement): Annotators checked and corrected pre-labeling results one by one, with special focus on small objects and occluded objects.

Phase 3 (Expert Review): Senior algorithm engineers conducted sampling reviews of labeling results to ensure quality.

Phase 4 (Model Training Feedback): Applied the initially trained model to labeling data, identified difficult samples, and performed secondary labeling.

Project Results and Validation: After 6 months of effort, the project achieved remarkable results:

Dataset Scale: 65,000 high-quality images labeled, containing over 300,000 bounding boxes across 19 obstacle categories.

Detection Performance: On the test set, the mean Average Precision (mAP) across all obstacle categories reached 93.7%, with large objects (such as furniture legs) exceeding 97% and small objects (such as cables, shoes) reaching 89.2%.

Real-World Application: In real home environments, the robot's obstacle avoidance success rate improved from 78% to 96.5%, collision rate decreased by 85%, and user satisfaction improved significantly.

Technical Breakthroughs: During the project, we also developed specialized data augmentation methods and loss functions for low-angle views, which were later applied to other similar projects.

Case 2: Smart Refrigerator Ingredient Recognition

Project Background: A leading domestic appliance manufacturer wanted to integrate AI ingredient management into their premium refrigerator products. Traditional weight sensing and barcode scanning methods couldn't meet user needs. The client wanted to use built-in cameras to automatically identify ingredient types, quantities, and freshness levels, providing intelligent ingredient management, expiration reminders, and recipe recommendation services.

In-Depth Business Requirements Analysis:

  • Ingredient recognition accuracy: Need to accurately identify over 100 common ingredients, including different varieties of vegetables, fruits, meats, dairy products, etc.
  • Packaging adaptability: Ability to recognize transparent, semi-transparent, opaque packaging, and unpackaged ingredients
  • Condition assessment: Not only identify ingredient types but also judge freshness levels
  • Multi-shelf management: Refrigerators typically have multiple shelves; need to identify ingredients on different levels
  • User privacy protection: Ingredient data processed locally only, not uploaded to the cloud

Ingredient Classification System: In collaboration with nutritionists and food scientists, we established a detailed ingredient classification system:

Vegetables (42 classes): Leafy greens: Spinach, bok choy, lettuce, yu choy, chives, celery, cilantro, etc. Root vegetables: White radish, carrots, potatoes, onions, ginger, garlic, lotus root, etc. Gourds and fruits: Cucumbers, tomatoes, eggplant, bell peppers, winter melon, pumpkin, bitter melon, etc. Mushrooms: Shiitake, enoki, king oyster, wood ear, silver ear, etc.

Fruits (35 classes): Pome fruits: Apples, pears, hawthorn, etc. Stone fruits: Peaches, plums, apricots, cherries, etc. Berries: Strawberries, blueberries, grapes, kiwi, dragon fruit, etc. Citrus: Oranges, tangerines, pomelo, lemons, etc. Tropical fruits: Bananas, mangoes, durian, coconut, etc.

Proteins (25 classes): Meats: Pork (pork belly, tenderloin, ribs, etc.), beef, lamb, chicken, etc. Seafood: Fish (grass carp, common carp, sea bass, etc.), shrimp, crab, shellfish, etc. Eggs: Chicken eggs, duck eggs, quail eggs, etc. Soy products: Tofu, dried tofu, bean curd sticks, etc.

Dairy & Beverages (15 classes): Dairy: Milk, yogurt, cheese, butter, etc. Beverages: Mineral water, juice, carbonated drinks, tea drinks, etc.

Seasonings & Dried Goods (18 classes): Seasonings: Salt, sugar, soy sauce, vinegar, cooking wine, pepper powder, etc. Dried goods: Wood ear, silver ear, dried shiitake, seaweed, kelp, etc.
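
A classification system like this is easiest to keep honest when its per-group class counts are checked in code. The sketch below models the five top-level groups as a small lookup table; the group keys are illustrative shorthand, not the project's actual identifiers.

```python
# Illustrative sketch of the ingredient label space: top-level groups
# mapped to their declared class counts (keys are shorthand for this
# example, not the project's real category identifiers).
CLASS_COUNTS = {
    "vegetables": 42,
    "fruits": 35,
    "proteins": 25,
    "dairy_beverages": 15,
    "seasonings_dried_goods": 18,
}

def total_classes(counts):
    """Total size of the label space across all top-level groups."""
    return sum(counts.values())

print(total_classes(CLASS_COUNTS))  # → 135
```

A check like this catches a common failure mode: subcategories added to the spec without the advertised class count being updated.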

Special Data Collection Challenges: The refrigerator environment presents unique challenges for ingredient recognition:

Lighting Conditions: Internal LED lighting in refrigerators typically has a cool color tone with uneven illumination — brighter near the light and darker farther away.

Transparency Issues: Glass bottles, transparent plastic containers, and plastic wrap make object boundaries blurry.

Reflection Issues: Refrigerator inner walls, metal containers, and glassware produce reflections that affect recognition.

Temperature Effects: Low-temperature environments may cause lens fogging, affecting image quality.

Packaging Diversity: The same ingredient may come in different packaging — loose, bagged, boxed, vacuum-sealed, etc.
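
Some of these conditions can be partly simulated during training with simple augmentations. Below is a minimal numpy sketch, assuming RGB uint8 images, that applies a cool color cast and a top-to-bottom brightness falloff similar to the uneven LED lighting described above; the gain and falloff values are illustrative, not tuned project parameters.

```python
import numpy as np

def simulate_fridge_lighting(img, blue_gain=1.25, falloff=0.5):
    """Augment an RGB uint8 image with a cool color cast (boosted blue
    channel) and a top-to-bottom brightness gradient, mimicking the
    uneven LED lighting inside a refrigerator."""
    h = img.shape[0]
    out = img.astype(np.float32)
    out[..., 2] *= blue_gain                       # cool (bluish) cast
    gradient = np.linspace(1.0, 1.0 - falloff, h)  # bright top, dark bottom
    out *= gradient.reshape(h, 1, 1)
    return np.clip(out, 0, 255).astype(np.uint8)

demo = np.full((4, 4, 3), 200, dtype=np.uint8)    # flat gray test image
aug = simulate_fridge_lighting(demo)
print(aug[0, 0], aug[-1, 0])                       # top vs bottom pixel
```

Augmentations like this let the model see "refrigerator-like" lighting even when part of the training data was collected under studio conditions.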

Labeling Strategy and Implementation: We developed specialized labeling strategies for the refrigerator environment:

Multi-Layer Labeling: Label not only the ingredients themselves but also packaging containers, so the model learns to distinguish between ingredients and packaging.

Condition Labeling: Label ingredient freshness levels (fresh, fair, near-expiry, spoiled) to support subsequent shelf-life management.

Occlusion Handling: Ingredients in refrigerators are often stacked, requiring precise occlusion handling.
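
A hypothetical annotation record combining these three strategies might look like the sketch below; the field names and file name are illustrative, not TjMakeBot's actual export format.

```python
# Hypothetical multi-layer annotation record: ingredient label, packaging
# layer, freshness state, and occlusion info in one object entry.
record = {
    "image": "fridge_0001.jpg",
    "objects": [
        {"label": "milk", "bbox": [120, 40, 260, 210],
         "packaging": "opaque_carton", "freshness": "fresh",
         "occluded": False},
        {"label": "tomato", "bbox": [300, 180, 360, 240],
         "packaging": "none", "freshness": "fair",
         "occluded": True, "visible_ratio": 0.6},
    ],
}

FRESHNESS_LEVELS = {"fresh", "fair", "near_expiry", "spoiled"}

def validate(rec):
    """Reject degenerate boxes and unknown freshness labels."""
    for obj in rec["objects"]:
        x1, y1, x2, y2 = obj["bbox"]
        if not (x1 < x2 and y1 < y2):
            return False
        if obj["freshness"] not in FRESHNESS_LEVELS:
            return False
    return True

print(validate(record))  # → True
```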

Quality Control Measures: A four-tier quality control system was established:

  1. Initial Labeling: Trained annotators perform preliminary labeling
  2. Mid-Level Review: Senior annotators review labeling results
  3. Expert Confirmation: Food experts confirm hard-to-distinguish ingredients
  4. Algorithm Validation: Pre-trained models validate labeling results
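
The four tiers can be modeled as a simple stage machine in which a rejection at any tier sends the item back for relabeling; the stage names below are shorthand for illustration.

```python
# The four review tiers plus a terminal "accepted" state.
PIPELINE = ["initial_labeling", "mid_review",
            "expert_confirmation", "algorithm_validation", "accepted"]

def advance(stage, approved=True):
    """Move to the next tier on approval; a rejection restarts labeling."""
    if not approved:
        return PIPELINE[0]
    i = PIPELINE.index(stage)
    return PIPELINE[min(i + 1, len(PIPELINE) - 1)]

stage = "initial_labeling"
for _ in range(4):          # four approvals walk through all tiers
    stage = advance(stage)
print(stage)  # → accepted
```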

Project Results and Business Value: After 8 months of development, the smart refrigerator ingredient recognition system launched successfully:

Recognition Accuracy: Overall recognition accuracy reached 92.1%, with common ingredients (such as apples, eggs, milk) exceeding 96%.

User Experience: Users no longer need to manually input ingredient information; the system automatically identifies and manages everything, greatly improving convenience.

Business Returns: Products equipped with this feature saw a 32% year-over-year sales increase, with a user rating of 4.8 out of 5.

Technical Accumulation: Low-light recognition and transparent object recognition technologies developed during the project were subsequently applied to other product lines.

Data Collection:

Collection environment: Inside refrigerator
Lighting: Built-in refrigerator LED lights
Viewing angle: Front view with refrigerator door open
Challenges:
- Item stacking
- Packaging diversity
- Partial occlusion

Labeling Standards:

Labeling content:
- Item bounding boxes
- Item categories
- Packaging status (packaged/unpackaged)
- Freshness level (optional)

Special handling:
- Stacked items: Label each visible item separately
- Transparent packaging: Label the ingredient inside the packaging
- Partially visible: Label the visible portion

Project Results:

  • Data volume: 30,000 images
  • Labeling categories: 50 classes
  • Recognition accuracy: 91.5%
  • User satisfaction: 4.5/5.0

Case 3: Home Security Person Identification

Project Background: A security company developing a smart home camera needed to identify family members, visitors, and suspicious persons.

Recognition Targets:

Person categories:
- Family members (requires facial recognition)
- Visitors (strangers)
- Delivery personnel (in uniform)
- Suspicious persons (abnormal behavior)

Behavior recognition:
- Normal activities (walking, sitting, standing)
- Suspicious behaviors (loitering, peeping, rummaging)
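
As a toy illustration of how "loitering" might be flagged from a person track: count the consecutive frames a tracked point stays inside a watch region. The frame rate, time threshold, and rectangle format below are all assumptions for the sketch, not the security company's actual logic.

```python
def in_region(point, region):
    """True if (x, y) lies inside the rectangle (x1, y1, x2, y2)."""
    x, y = point
    x1, y1, x2, y2 = region
    return x1 <= x <= x2 and y1 <= y <= y2

def is_loitering(track, region, fps=10, min_seconds=30):
    """Flag a track whose point stays in the watch region for at least
    min_seconds of consecutive frames."""
    needed = fps * min_seconds
    run = 0
    for point in track:
        run = run + 1 if in_region(point, region) else 0
        if run >= needed:
            return True
    return False

doorway = (0, 0, 100, 100)
short_visit = [(50, 50)] * 5 + [(500, 500)]
print(is_loitering(short_visit, doorway, fps=1, min_seconds=30))  # → False
```

Labeled loitering examples give a behavior classifier something far more robust than a hand-tuned rule like this, but the rule makes the labeling target concrete.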

Privacy Protection:

Data anonymization:
- Face blurring (training data)
- Sensitive area masking
- Encrypted data storage

Compliance requirements:
- Obtain user informed consent
- Prioritize local data processing
- Regularly delete historical data

Labeling Standards:

Person labeling:
- Full-body bounding boxes
- Body keypoints (optional)
- Behavior labels

Privacy handling:
- Apply face blurring after labeling is complete
- Retain body posture information
- Remove identifiable features
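
Face blurring after labeling can be as simple as a mosaic (pixelation) pass over the labeled face box. Here is a numpy-only sketch assuming uint8 images and (x1, y1, x2, y2) boxes; a production pipeline would typically use a proper blur from an imaging library instead.

```python
import numpy as np

def pixelate_region(img, bbox, block=8):
    """Return a copy of img with the bbox region replaced by a mosaic:
    each block x block tile is flattened to its mean color."""
    x1, y1, x2, y2 = bbox
    out = img.copy()
    region = out[y1:y2, x1:x2].astype(np.float32)
    h, w = region.shape[:2]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = region[by:by + block, bx:bx + block]
            tile[:] = tile.mean(axis=(0, 1))   # average away facial detail
    out[y1:y2, x1:x2] = region.astype(np.uint8)
    return out

frame = np.zeros((16, 16, 3), dtype=np.uint8)
frame[:, :, 0] = np.arange(16, dtype=np.uint8)   # horizontal gradient
anonymized = pixelate_region(frame, (4, 4, 12, 12), block=8)
```

Because the bounding box and posture labels are kept while the pixels inside the face box are destroyed, the training data stays useful for detection without remaining personally identifiable.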

Project Results:

  • Data volume: 100,000 images
  • Person detection accuracy: 96.8%
  • Behavior recognition accuracy: 89.2%
  • False positive rate: <2%

🛠️ TjMakeBot Smart Home Labeling Features in Detail

In smart home object recognition projects, choosing the right labeling tool is critical to project success. TjMakeBot, as a professional AI data labeling platform, offers deep optimization and rich functionality for the smart home domain.

Scene-Based Labeling Templates

Preset Category System: TjMakeBot includes a smart home category system validated through multiple real projects, covering 150+ common household item categories. Rather than a flat list, the categories are carefully designed around actual application scenarios:

  • Furniture Category Template: Contains 50+ common furniture items including sofas, chairs, beds, wardrobes, and desks, each with detailed labeling specifications and examples
  • Appliance Category Template: Covers TVs, refrigerators, washing machines, microwaves, and other home appliances, plus various small appliances, with support for brand and model subdivision
  • Daily Items Category Template: From tableware and toiletries to decorations, covering household essentials
  • Customizable Extensions: Allows users to add project-specific categories with support for category inheritance and attribute extension

Room Scene Templates: The platform provides specialized labeling templates for different room characteristics:

  • Living Room Scene: Highlights sofas, coffee tables, TVs, and other main furniture; optimized for human activity and entertainment device recognition
  • Bedroom Scene: Focuses on bedding, wardrobes, vanity tables, etc.; supports sleep monitoring-related item labeling
  • Kitchen Scene: Emphasizes precise labeling of kitchenware, ingredients, and appliances; supports food safety-related features
  • Bathroom Scene: Optimized for wet environments and privacy protection needs
  • Study/Office Scene: Highlights stationery, electronic devices, and other office supplies recognition

Intelligent Labeling Assistance

AI Auto-Recognition: TjMakeBot integrates advanced computer vision models that can automatically identify objects in images and generate preliminary labels:

  • Real-Time Pre-Labeling: After uploading images, the system automatically identifies and generates bounding boxes with accuracy exceeding 70%
  • Voice Command Support: Users can label through natural language commands such as "identify all furniture in the image," "label kitchen appliances," "detect floor obstacles," or "find all red items"
  • Batch Processing: Supports pre-labeling entire datasets, dramatically improving labeling efficiency
  • Incremental Learning: The system continuously optimizes based on user correction feedback, improving recognition accuracy for specific scenarios
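
Conceptually, pre-labeling turns raw model detections into draft boxes that a human still confirms. The detection fields and threshold below are assumptions for illustration, not TjMakeBot's actual API.

```python
def prelabel(detections, conf_threshold=0.7):
    """Keep detections above the confidence threshold as draft
    annotations explicitly flagged for human review."""
    return [
        {"label": d["label"], "bbox": d["bbox"],
         "source": "model", "needs_review": True}
        for d in detections
        if d["score"] >= conf_threshold
    ]

detections = [
    {"label": "sofa", "bbox": [10, 20, 300, 200], "score": 0.92},
    {"label": "cable", "bbox": [40, 180, 90, 195], "score": 0.41},
]
drafts = prelabel(detections)
print([d["label"] for d in drafts])  # → ['sofa']
```

Low-confidence detections are dropped rather than shown, since correcting a bad draft box often takes annotators longer than drawing a fresh one.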

Smart Recommendation System: Based on deep learning and scene understanding technology, TjMakeBot can intelligently recommend possible labeling options:

  • Scene-Aware Recommendations: Recommends likely item categories based on the current room type, such as prioritizing kitchenware and ingredients in kitchen scenes
  • Related Item Recommendations: Recommends potentially related items based on already-labeled objects, such as suggesting "coffee cup" and "coffee beans" after labeling a "coffee maker"
  • Context-Aware: Recommends items based on spatial relationships and functional logic, such as recommending tableware in nearby areas after labeling a "dining table"
  • User Habit Learning: The system learns annotator habits and preferences to provide personalized recommendations
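
One simple way to implement related-item recommendations is a co-occurrence table mined from previously labeled scenes. The sketch below uses invented counts purely for illustration.

```python
from collections import Counter

# Toy co-occurrence counts: how often item B appeared in past scenes
# that also contained item A (numbers invented for this example).
CO_OCCURRENCE = {
    "coffee_maker": Counter({"coffee_cup": 40, "coffee_beans": 25, "kettle": 10}),
    "dining_table": Counter({"plate": 55, "chopsticks": 30, "glass": 20}),
}

def recommend(labeled, k=2):
    """Suggest the k labels most often co-occurring with those already placed."""
    scores = Counter()
    for label in labeled:
        scores.update(CO_OCCURRENCE.get(label, Counter()))
    for label in labeled:
        scores.pop(label, None)       # never re-suggest an existing label
    return [label for label, _ in scores.most_common(k)]

print(recommend(["coffee_maker"]))  # → ['coffee_cup', 'coffee_beans']
```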

Collaborative Labeling Features:

  • Multi-Person Real-Time Collaboration: Supports multiple annotators working on the same dataset simultaneously with automatic result merging
  • Conflict Detection and Resolution: When multiple annotators give different labels for the same object, the system alerts and assists in resolution
  • Progress Synchronization: Real-time progress sync to avoid duplicate work

Privacy Protection and Data Security

Data Anonymization Technology: Considering the privacy sensitivity of home environments, TjMakeBot provides multiple privacy protection features:

  • Automatic Face Blurring: Uses advanced face detection algorithms to automatically identify and blur faces, protecting family member privacy
  • License Plate Detection and Masking: Automatically detects and masks vehicle license plates to prevent personal information leakage
  • Screen Content Detection: Identifies and masks phone, computer, and other device screen content to protect digital privacy
  • Manual Sensitive Area Masking: Users can manually mark areas needing protection; the system permanently masks these areas

Data Security Measures:

  • End-to-End Encryption: All data transmission and storage uses AES-256 encryption
  • Access Permission Control: Role-based permission management ensuring only authorized personnel can access data
  • Audit Log Tracking: Detailed recording of all operations for security auditing
  • Data Localization: Supports local data storage to meet data sovereignty requirements
  • Automatic Data Cleanup: Supports setting data retention periods with automatic secure deletion upon expiry
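
The retention-period cleanup amounts to a query over record timestamps; a minimal sketch, with field names and the 90-day window chosen for illustration rather than taken from the platform.

```python
from datetime import datetime, timedelta

def select_expired(records, retention_days, now):
    """IDs of records past the retention window, due for secure deletion."""
    cutoff = now - timedelta(days=retention_days)
    return [r["id"] for r in records if r["created"] < cutoff]

now = datetime(2025, 6, 1)
records = [
    {"id": "batch_a", "created": datetime(2025, 1, 1)},
    {"id": "batch_b", "created": datetime(2025, 5, 20)},
]
print(select_expired(records, 90, now))  # → ['batch_a']
```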

Compliance Assurance: TjMakeBot strictly complies with major global privacy regulations:

  • Compliant with GDPR (EU General Data Protection Regulation)
  • Compliant with CCPA (California Consumer Privacy Act)
  • Compliant with China's Personal Information Protection Law and related regulations

Efficient Labeling Toolset

Multi-Modal Labeling Support:

  • Image Labeling: Supports JPEG, PNG, TIFF, and other formats
  • Video Labeling: Supports keyframe extraction, trajectory tracking, and more
  • 3D Point Cloud Labeling: Supports LiDAR data for advanced smart home systems
  • Multi-View Labeling: Supports joint labeling of multi-angle images of the same scene

Quality Control System:

  • Cross-Validation: Same image labeled independently by multiple annotators with automatic result comparison
  • Expert Review: Supports expert review workflows to ensure labeling quality
  • Consistency Checks: Automatically detects labeling inconsistencies and alerts
  • Quality Scoring: Generates quality reports for each annotator for management purposes
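
Cross-validation agreement between two annotators' boxes is commonly measured with intersection-over-union (IoU); here is a minimal sketch, with the 0.7 agreement threshold as an illustrative choice.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def boxes_agree(a, b, threshold=0.7):
    """Treat two annotators' boxes as consistent when IoU clears the bar."""
    return iou(a, b) >= threshold

print(boxes_agree((0, 0, 10, 10), (1, 1, 10, 10)))  # → True
```

Boxes that fall below the threshold are the ones surfaced to reviewers, so human attention goes only to genuine disagreements.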

Automation Features:

  • Batch Operations: Supports batch copy, move, and delete labeling boxes
  • Keyboard Shortcuts: Rich shortcut key settings to boost labeling speed
  • Template Reuse: Save common labeling configurations and apply to new projects with one click
  • Smart Zoom: Automatically adjusts image display ratio for labeling small objects

Through these features, TjMakeBot not only improves labeling efficiency but more importantly ensures labeling quality, providing a reliable data foundation for smart home AI model training.

💬 Conclusion

Smart home AI is profoundly changing our way of life. From simple voice control to complex scene perception, AI technology has become an indispensable part of modern homes. However, behind all of this lies a common foundation — high-quality data labeling. Without precise, comprehensive labeled data, even the most advanced algorithms are castles in the air.

Object recognition in home scenarios indeed faces numerous challenges: vast and constantly changing item variety, highly irregular placement, extremely complex lighting conditions, and strict privacy protection requirements. These challenges mean we cannot simply apply generic labeling methods but need to develop targeted labeling strategies.

In-Depth Summary of Key Points:

  1. Establish a standard category system: This is the foundation of all labeling work. Function-oriented classification aligns better with human cognition, the granularity balance principle ensures efficiency and accuracy are both addressed, and the mutually exclusive and collectively exhaustive property guarantees labeling consistency.

  2. Handle occlusion and truncation: Occlusion is everywhere in home environments. A tiered labeling strategy allows us to adopt appropriate handling methods for different occlusion levels, ensuring both data completeness and labeling efficiency.

  3. Cover multiple viewing angles: Different AI devices have vastly different perspectives — from the top-down view of robot vacuums to the eye-level view of smart cameras. Comprehensive view coverage is key to model generalization.

  4. Label scene context: Isolated object recognition is often insufficient. Scene context information enables AI to better understand the environment and make smarter decisions.

  5. Protect user privacy: The home is the most private space. Data security and privacy protection are not only legal requirements but also the foundation for earning user trust.

Through the three real-world cases we shared, it's clear that successful smart home AI projects require not only technical breakthroughs but also thorough effort in the foundational step of data labeling. The robot vacuum obstacle avoidance system, the smart refrigerator ingredient recognition, and the home security person identification all demonstrate the decisive role of high-quality labeled data in AI performance.

Future Outlook: As technology continues to advance, smart home AI will become more intelligent and human-centric. Future labeling work will also move toward greater automation and intelligence. TjMakeBot will continue to optimize our labeling tools, integrating more AI-assisted features to make data labeling more efficient and precise.

Let AI better understand the home, starting with high-quality data labeling! Every precise bounding box, every careful classification, is laying the foundation for a better smart home life.


Try TjMakeBot for Free for Smart Home Labeling →


Keywords: smart home, object recognition, home AI, robot vacuum, smart camera, scene understanding, TjMakeBot, data labeling, AI training data, computer vision