Article

Simple Hybrid Camera-Based System Using Two Views for Three-Dimensional Body Measurements

by
Mohammad Montazerian
*,† and
Frederic Fol Leymarie
Computing Department, Goldsmiths, University of London, London SE14 6NW, UK
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Symmetry 2024, 16(1), 49; https://doi.org/10.3390/sym16010049
Submission received: 25 August 2023 / Revised: 1 December 2023 / Accepted: 8 December 2023 / Published: 29 December 2023

Abstract
Using a single RGB camera to obtain accurate body dimensions, rather than measuring these manually or via more complex multicamera systems or more expensive 3D scanners, has a high application potential for the apparel industry. We present a system that estimates upper human body measurements using a hybrid set of techniques from both classic computer vision and recent machine learning. The main steps involve (1) using a camera to obtain two views (frontal and side); (2) isolating in the image pair a set of main body parts; (3) improving the image quality; (4) extracting body contours and features from the images of body parts; (5) placing markers on these images; (6) performing a calibration step; and (7) producing refined final 3D measurements. We favour a unique geometric shape, that of an ellipse, to approximate the main horizontal cross-sections of the human body. We focus on the more challenging parts of the body, i.e., the upper body from the head to the hips, which, we show, can be well represented by varying an ellipse’s eccentricity for each individual. Then, evaluating each fitted ellipse’s perimeter allows us to obtain better results than the current state-of-the-art methods for use in the fashion and online retail industry. In our study, we selected a set of two equations, out of many other possible choices, to best estimate upper human body section circumferences. We experimented with the system on a diverse sample of 78 female participants. Compared with the traditional manual method of tape measurements, used as a reference, our upper human body measurements show average differences of ±1 cm, which is sufficient for many applications, including online retail.

1. Introduction

In this article, we present the results of our study, which demonstrate the feasibility of designing a simple yet sufficiently accurate and economical system, based on a single camera phone, to retrieve useful body measurements with the potential to address the current problems of the online retail and fashion industry. We seek a system that offers a number of key advantages over current off-the-shelf platforms, including simplicity, accuracy, and flexibility.
Simplicity refers to our way of representing the human body as a series of adaptable elliptic horizontal slices, which, although not following all body details, is the main shape favoured by fashion designers and well suited for most day-to-day clothing. Accuracy refers to the fashion sector’s need to achieve results at least comparable to the main traditional measurements taken manually with tape. Flexibility refers to designing a system that, while remaining simple in its concepts, can evolve with the progress made in technologies, such as those provided by machine learning.
Underlying these goals, we implicitly take advantage of the symmetry of the human body with respect to a normal standing pose. In our work, the human body is represented by a main vertical axis going through its centre from an (approximate) highest point at the tip of the head. Limbs (arms and legs) are assumed to be symmetric, and ellipses need only be fitted to one arm or leg. However, while limbs are very well approximated by elliptic slices (again using an approximate central skeletal axis for each limb), we focus in this work on the more challenging parts of the human body: from the head down to the hips. With such a grounding in symmetry, we describe a practical system that combines recent machine learning for object and body part localisation in images with more traditional computer vision techniques to refine our results, together with a selection of the best equations to calculate the perimeters of fitted ellipses as a function of each ellipse’s eccentricity.

Article’s Organisation

Our background section (Section 2) covers five related topics: (i) we first discuss traditional measurement approaches used in the fashion sector; (ii) we then consider contemporary computerised techniques; (iii) we also briefly consider recent cutting-edge deep learning approaches; (iv) we review the state of the fashion and online retail industry; and (v) we consider challenges and motivations for enhancing existing body sizing apps. In the next section (Section 3), we explain our methodology, through various steps, starting from image capture and ending with the computation of upper body measurements through the application of markers. We then present our main research findings (Section 4), followed by our final conclusions (Section 5).

2. Background

2.1. Traditional Measurements

Traditional anthropometric measurement extraction methods can be divided into two types: landmark-based and template-based. Landmarks are usually located at or near the joints or other identifiable body features. Based on such landmarks, various body measurement standards, including those of the ISO (International Organization for Standardization), permit the characterisation of the human body. While these methods can accurately measure the body’s proportions and size, they are time-consuming, requiring manual measurements. In addition, most brands have their own measuring standards, leading to incompatibilities between outputs.
Template-based methods, however, involve measuring specific body proportions using a template, which is a scaled 2D/3D representation of the body based on specific dimensions. The template is placed over a person, and the distance between the template’s points is measured to obtain the measurements. This method evaluates the body’s size, shape, and proportions to determine the correct clothing size. Template-based approaches are quick and reliable. They are also a cost-effective way of measuring the body, saving both time and money. However, the templates represent averages over population groups and do not account for a large variety of body types. Complicated body shapes cannot be well approximated using only templates. Therefore, their use is unsuitable for custom-made clothing, which requires more precise measurements. In addition, they do not allow for the individual’s preferences and do not permit design flexibility.
While both types of methods are still used in the fashion sector, there are other techniques available to overcome the growing problem of wrong fits for online shopping. Recent techniques include 3D body scanning, virtual try-on systems, and systems exploiting recent developments in artificial intelligence (AI). Such techniques, which we briefly survey next, can potentially provide more accurate and precise measurements, allowing for a better fit and more personalised clothing options.

2.2. Recent Computerised Methods

To overcome the problem of wrong fits for online shopping [1], many research projects have been conducted, and one of the most promising solutions is to estimate human body measurements directly from 2D images. This typically requires users to wear very tight clothes or just underwear. Another group of studies uses more complex depth-sensing systems to obtain 3D images, which are then used to estimate body measurements. Below, we present a brief summary of recent research efforts that focus on estimating human body measurements from the outer surface.
Xiaohui et al. [2] developed a technique to automatically extract feature points and measure clothing on 3D human bodies. The method requires a depth-sensing platform, which is currently unavailable to the majority of online customers. Additionally, numerous poses must be captured. Chang et al. [3] presented a dynamic fitting room that allows users to view a real-time image of themselves while trying on various digital garments. The method employs the Microsoft Kinect (Microsoft Corporation, Redmond, WA, USA) sensing platform and augmented reality technologies. The system calculates the user’s body height based on head-to-foot loci and depth data collected by two Kinect cameras, for front and side views. The system produces results that are sufficiently accurate (less than a centimetre of error on average).
Chandra et al. [4] developed an approach that aims to determine human body measurements using portable cameras, such as those on smartphones. When the system recognises a human body in its viewing field, it tracks the body’s outline. To achieve this, the system searches for the face, neck, and shoulders, which are taken as markers for the upper body. However, there are some constraints, as users must set up certain environmental parameters (including the light, background, and type and contrast of clothes) and must perform the measurement capture at least five times. In a related study, Ashmawi et al. [5] introduced an approach for estimating human body measurements using smartphone cameras. Their method employs Haar-based detectors to identify key markers for the upper, lower, and full body regions. These markers are then utilised to calculate body measurements. However, it is important to note that their approach can only provide relatively broad size categories, such as the small, medium, large, XL, and XXL classifications commonly used in retail. The level of accuracy achieved by their method is not sufficient for our goals.

2.3. Deep Learning Methods

Choudhary et al. [6] introduced MeasureNet, a neural network model for predicting body circumferences and waist-to-hip ratios (WHRs) from colour images taken from three angles: front, side, and back. The model’s training relied on a synthetic dataset comprising around one million 3D body shapes, designed to capture the diversity in body mass index (BMI) and poses within the North American demographic. To ensure realistic parameter sampling, these shapes were fitted to 3500 actual body scans and clustered using a Gaussian Mixture Model. MeasureNet offers the precision of clinical measurements, suitable for home-based self-measurements. The model was fine-tuned with an extensive texture library and validated against ground truth values using metrics like the MAE (mean absolute error) and MAPE (mean absolute percentage error). While specific numerical accuracy is not provided, the method offers a notable improvement over self-measurement practices, especially for critical metrics like the WHR, with significant health implications. Its limitations include potential inaccuracies when using normal clothing, a bias towards the North American demographic in the training data, and the necessity of accessing large, reliable datasets for re-training for different demographics.
Danadoni et al. [7,8] created a cloud API on Amazon Web Services (AWS) that can be easily integrated into other platforms to extract up to 24 body measurements from two images (front and side views), building on existing open-source algorithms. According to their findings, the median difference is less than 2.6 cm (about one inch), which is insufficient for commercial applications.
Kaashki et al. [9] developed Anet, a deep neural network for automatic anthropometric measurements from 3D human body scans. Anet consists of two components: a feature extraction network and an anthropometric measurement extraction network. Features are recovered from a 3D scan, including the posture, shape, and proportions. The anthropometric measurement extraction network then uses these features to automatically estimate 3D anthropometric measurements. Compared to existing methods, Anet has a mean absolute error of less than one centimetre. This suggests Anet could reduce the cost and time associated with obtaining anthropometric measurements and generate digital human models for virtual reality and computer animation. Such software, however, may be overly complex for practical deployment, due to the need to estimate anthropometric measurements from dense, accurate 3D data.
The SHAPY model, by Michael Black et al. [10], provides accurate 3D body shape estimation without explicit 3D shape supervision. The model uses linguistic shape attributes and anthropometric measurements as proxy annotations for training a regressor, enabling accurate body shape predictions. The framework is evaluated on various datasets, including the “Human Bodies in the Wild” (HBW) dataset, which provides natural and varied body shapes and clothing in real-world settings. The HBW dataset comprises photographs of individuals in lab settings as well as in the field, accompanied by 3D shape data derived from body scans. SHAPY gives impressive results on the HBW dataset, estimating body measurements such as heights and chest, waist, and hip circumferences, with mean absolute errors ranging between 5.8 mm and 6.2 mm for various gender and attribute combinations. SHAPY’s advantages include its superior performance, its ability to accommodate diverse body shapes and apparel variations, and its potential to reduce the cost and time required to obtain anthropometric measurements. However, limitations exist. SHAPY’s model-agency training dataset may not be representative of the entire human population, which can impact the accuracy of body shape predictions for individuals whose shape is not well represented in the training set. Due to the computational intricacy of estimating anthropometric measurements from dense and accurate 3D data, the framework’s implementation in practice may also prove challenging.
In comparison to the more sophisticated recent deep learning methods, our method aims for simplicity, yet with sufficient accuracy for our application domain. Also, it does not rely on any re-training or access to very large datasets, as we only use deep learning via a pre-trained network for the single purpose of an initial 2D image segmentation: localising and coarsely segmenting, via bounding boxes, those body parts useful for fashion measurements.

2.4. Overview of the Current Situation in the Fashion and Retail Industry

The fashion industry ranks as the world’s second most water-intensive sector, consuming approximately 79 billion cubic metres of water annually [11]. Given that 2.7 billion individuals currently face water scarcity, this statistic becomes even more alarming [12].
Online shopping platforms have been attracting a continuously growing number of customers over the years since their introduction at the turn of the millennium. A well-known problem for online clothing retailers is the high return rate due to a poor fit [13,14,15]; e.g., at the start of 2022, returns for USA retailers accounted for over USD 761 billion in lost sales [16]. Customers will often end up ordering the same piece in different sizes to choose the right fit and then send back the others. This causes significant costs for the retailer and also has a significant negative impact on the environment.
Return rates could be reduced significantly if customers had a convenient method for obtaining accurate anthropometric measurements. It would also be useful to establish improved understanding and communication between retailers and customers, for instance via a standard definition of the anthropometric measurements relevant to selecting the correct garment size, irrespective of the brand. Current methods for accurately analysing the 3D shape of the human body often necessitate expensive equipment, such as laser-based 3D scanners or multiple fixed camera setups, which is typically neither widely accessible nor easily transportable.
Due to a lack of accuracy, robustness, and ease of use, existing body measurement systems based on affordable consumer hardware (such as smartphones or tablets) are thus far unable to satisfy both consumers and retailers. These technologies rely heavily on user input. This implies that the user must follow specific instructions in order to obtain accurate measurements, such as performing specific movements in front of the camera or holding various poses for a specified amount of time.
Our proposed system is aligned with evolving consumer preferences, such as the Gen Z market’s demand for personalised products and a growing awareness of environmental sustainability in the fashion sector. Our lightweight measurement system, primarily utilising smartphones for data capture, offers a promising solution to these multifaceted challenges while fostering a more sustainable and eco-friendly fashion industry.

2.5. Challenges and Motivations: Existing Body Sizing Applications

In Figure 1, we present the average error in centimetres for different horizontal sections of the human body, as measured by five selected off-the-shelf applications. These applications were chosen based on the criteria of reliability, popularity, availability, and relevance to our research. During data collection, a subset of 30 participants stood in front of the camera, each being measured approximately 10 times; we compared the multiple measurements from each of the five apps and retained the most accurate one for analysis. Participants were divided into two groups based on gender to account for anatomical differences, and they wore tight-fitting clothing to minimise the influence of loose clothing on the measurements. The participants were also subjected to uniform conditions across the five applications: the same parameters, settings, and guidelines were followed with each app, to minimise variations arising from differences in how the applications function or in their default settings. Further details can be found in Appendix A.1.
Based on our research, we aim for an average error of 1 cm or less in the measurements retrieved from our system. This target relates to industry standards and previous studies that have shown that an error of up to 1 cm is acceptable for garment sizing [17,18]. Manual methods for measuring the body, such as tape measurements and manual anthropometry, have been shown to have errors ranging from 1 cm to 3 cm [19,20,21]. Therefore, our target accuracy of 1 cm or less would represent a significant improvement over manual methods and would be suitable for practical use in the fashion industry. Additionally, there can be significant differences between the direct (tape) measurement results and those based on software, especially for upper human body circumferences such as for the chest and waist.
Given these challenges and the potential for improvement, we decided to focus on the upper body—from the hips and up. While capturing accurate measurements for the lower body or the limbs is more easily achievable, concentrating on the upper body allowed us to address the greater variations in body contour specific to this region and explore more accurate and reliable solutions for body measurements, particularly for fashion-related applications. The upper half of the body presents more variations in shape and size, both between individuals and for an individual over time. In contrast, the lower body, particularly below the hips, exhibits fewer deformations and can be very well approximated using ellipses for cross-sections.

3. Method

There are a number of key steps that are part of our method to process images towards evaluating 3D body measurements. The first important step involves the use of recent ML techniques to process images of the front and side views of a person, to obtain good localisations of a number of subregions that characterise body parts of interest, together called ROIs (regions of interest). Another step involves using a number of classic image processing techniques to refine image properties and recovered body outline segments. Another classic method we use is that of camera calibration, so that we can make accurate measurements (in cm) from image data. We also adapt a colour segmentation scheme to identify skin (of various shades) in body parts.
In terms of a pipeline of steps and processes (Figure 2), (i) once body parts are isolated, using, in our case, a smartphone camera and a deep learning model, we then (ii) improve the image quality, (iii) use the previously isolated ROIs to discard irrelevant or unwanted parts of an image and refine contour outlines of individual body parts, (iv) automatically or interactively set a small number of markers, (v) use calibration to determine the pixel to centimetre (cm) relation, and (vi) by computing the differences between markers combined with using a selected set of ellipse equations, we estimate human upper body circumferences.
As part of the data capture process, users can wear casual clothing, with a few constraints: clothing should be tight enough around the waistline, shoulder line, and neck so as not to obscure the corresponding body outline segments, and long hair should not mask the shoulder areas and should be kept flowing down the back (e.g., in a ponytail). Note that this is less restrictive than many other proposed approaches, which require users to wear only tight underwear or tightly fitting clothes over the entire body. In terms of poses, our system requires a so-called “A-pose”: typically with arms held straight and hands pointing down, at an angle away from the armpit, chest, and hips. The A-pose is needed in our hybrid system to permit the automatic, accurate localisation in the front view of markers for the shoulders, bust, waist, and hips.

3.1. Body Parts via ROIs

Given a pair of images (front and side views), we want to directly locate useful anatomical parts—which can provide important measurements for fashion design or good fit finding for online retail—including the head, chest, bust, waist, hips, shoulders, neck, and full body height. We achieve this goal by training a deep learning method with specific labelled bounding boxes, which we refer to as regions of interest (ROIs). The purpose of this segmentation into meaningful regions is to speed up, with good accuracy, the identification of measurement locations. For our ML module, we selected the MobileNet SSD, pre-trained on a dataset to detect the human body [22]. MobileNet uses depth-wise separable convolutions to build a lightweight deep neural network; it was selected in preference to other object detection methods as it is well suited to mobile devices and other embedded applications.
After conducting a thorough evaluation of various methods, including YOLO (v5), OpenPose, and EfficientNet, we concluded that while models like EfficientNet may provide more accurate results in certain scenarios, MobileNet SSD offered a combination of performance, efficiency, and compatibility that made it the preferred choice for our pipeline. When compared to YOLO (v5), MobileNet SSD demonstrated better accuracy and robustness in locating human body parts from complex backgrounds, particularly under challenging lighting conditions and textured backgrounds. Additionally, MobileNet SSD outperformed OpenPose by providing more accurate localisation results, especially in side views and scenarios with difficult lighting conditions. Compared to EfficientNet, MobileNet SSD’s lightweight architecture and efficient inference made it particularly suitable for resource-constrained devices, such as smartphones, without compromising real-time performance. Furthermore, our experiments consistently showed that MobileNet SSD provided better ROI localisation results and improved the efficiency of subsequent image corrections and human body area measurements. We should expect that over time other new or refined ML-based solutions may prove superior to our use of MobileNet SSD. Essentially, then, one would adapt our pipeline by updating the initial body part localisation.
More specifically, we define ROIs by selecting rectangular areas in an image to isolate these for further processing. Such ROIs are used in the training of our ML method. We currently use up to eight specific rectangular regions, which may overlap somewhat, corresponding to meaningful anatomic areas: the (i) head, (ii) neck, (iii) shoulders, (iv) bust, (v) chest, (vi) waist, (vii) hips, and (viii) height (feet to head top). This detailed extraction is essential in our pipeline towards obtaining accurate enough measurements. Currently, the approach involves LabelImg, a graphical image annotation tool (https://pypi.org/project/labelImg (accessed on 1 September 2021)), which we used to label image components and thus identify the presence of desired anatomical regions.
The accuracy of the human body measurement in our system relies on extracting body part outlines from images, despite potentially complex backgrounds. ROIs allow the system to focus on specific areas of interest and disregard irrelevant parts, such that only the pixels within these regions are to be further processed. An example of the automatic extraction of such ROIs is illustrated in Figure 3. Once such ROIs are obtained, we proceed with classic methods from computer vision to recover body outline segments, thus requiring no further ML training.
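To make the ROI step concrete, the following minimal sketch runs a MobileNet SSD detector with OpenCV’s dnn module and collects the boxes above a confidence threshold. The model file names, class mapping, and the 0.5 threshold are illustrative assumptions; the actual system uses a network trained on the eight labelled anatomical ROIs described above.

```python
import cv2
import numpy as np

# Minimal sketch of MobileNet SSD inference via OpenCV's dnn module.
# Model files and threshold are assumptions for illustration; the paper's
# detector is trained on its eight anatomical ROI labels.
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")

def detect_rois(image_bgr, conf_threshold=0.5):
    h, w = image_bgr.shape[:2]
    # MobileNet SSD expects 300x300 inputs, scaled and mean-subtracted.
    blob = cv2.dnn.blobFromImage(cv2.resize(image_bgr, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()  # shape: (1, 1, N, 7)
    rois = []
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence > conf_threshold:
            label_id = int(detections[0, 0, i, 1])
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            rois.append((label_id, confidence, box.astype(int)))  # x1,y1,x2,y2
    return rois
```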

3.2. Image Corrections and Processing

In our current pipeline, for each body part ROI we apply (i) a metric correction using an affine camera model; (ii) grey-level mapping of an RGB input, together with the removal of small image regions and image smoothing [23,24]; (iii) edge detection [23]; and (iv) mask operations on matrices, and we subsequently perform thresholding as part of the process [25]. Then, we perform a step of skin detection [26].
The metric correction using an affine camera model permits the rectification of distortions caused by factors such as camera angles or lens imperfections. Transforming the image from RGB to greyscale through grey-level mapping aids in highlighting essential features and reducing the complexity of further data processing. Next, we simplify the image further by applying Gaussian smoothing, for its effectiveness in reducing noise and enhancing the overall quality of the image. Edge detection is then applied based on the classic Canny edge detection algorithm. We conduct edge contour tracking via hysteresis to detect body part contour segments by suppressing weak pixels not connected to strong ones as highlighted by the Canny operator.
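A condensed sketch of steps (ii) and (iii) using OpenCV follows; the kernel size and hysteresis thresholds are illustrative defaults rather than the per-ROI values tuned in our pipeline.

```python
import cv2

def outline_edges(roi_bgr, low=50, high=150):
    grey = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2GRAY)  # grey-level mapping
    smooth = cv2.GaussianBlur(grey, (5, 5), 0)        # Gaussian noise reduction
    # Canny performs gradient estimation, non-maximum suppression, and
    # hysteresis: weak edge pixels survive only when connected to strong ones.
    return cv2.Canny(smooth, low, high)
```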

Regional Skin Detection

We also seek to accurately identify and isolate human skin in our images (skin detection), which can further improve the location of body part outline segments, such as in the neck area. Our method involves colour space transformation, ROIs, parameter estimation, and final skin segmentation. We use both the HSV (Hue, Saturation, Value) and YCbCr (Luminance, Chrominance Blue, Chrominance Red) colour mappings, as they are known for their effectiveness in skin colour representation, as well as skin tone differentiation.
Specifically, the HSV colour space can capture skin tone variability under various lighting conditions. In this space, hues are relatively consistent across skin tones, with the brightness and saturation revealing insights about shadows and lighting conditions. In contrast, the YCbCr colour space provides distinct clustering of skin tones in the chrominance channels, irrespective of luminance variations. This behaviour helps to distinguish skin pixels from non-skin pixels, making skin detection more accurate across diverse ethnicities.
After transforming the image into the HSV and YCbCr colour spaces, we identify and isolate image areas that resemble human skin. For this purpose, we use specific ROIs for each body part. By carefully analysing the colour distribution and patterns within the HSV and YCbCr channels, we are able to create effective segmentation thresholds for skin for each region and thus separate skin and non-skin areas, such as clothing or hair. Our skin detection method is not limited to the face or hands, as found in many other applications (Figure 4). We note that skin detection has proven invaluable in our efforts to remove hair that may obscure parts of the body, especially in the neck and shoulder areas. Our earlier studies revealed that any hair covering parts of the body could introduce significant inaccuracies in the final measurement estimations.
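As an illustration, a minimal dual-colour-space skin mask can be built with OpenCV as below. The threshold ranges are common starting points from the skin detection literature, not the per-ROI thresholds derived from our own colour-distribution analysis.

```python
import cv2
import numpy as np

def skin_mask(roi_bgr):
    hsv = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2HSV)
    ycrcb = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2YCrCb)  # OpenCV's YCbCr layout
    # Illustrative bounds; the paper derives per-region thresholds instead.
    mask_hsv = cv2.inRange(hsv, np.array([0, 40, 60]),
                           np.array([25, 255, 255]))
    mask_ycrcb = cv2.inRange(ycrcb, np.array([0, 135, 85]),
                             np.array([255, 180, 135]))
    # A pixel counts as skin only when both colour spaces agree.
    mask = cv2.bitwise_and(mask_hsv, mask_ycrcb)
    # Morphological opening removes small speckles from the mask.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```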
Regarding the ethnicity-related concern of skin colour variations across the global population, our approach has been designed with inclusivity in mind. The choice of HSV and YCbCr colour spaces enables our system to accommodate the spectrum of human skin tones. Preliminary tests of our algorithm across diverse races have yielded positive outcomes. However, we acknowledge the dataset’s size as a current limitation. Efforts are underway to enhance our dataset, ensuring more extensive coverage and thereby refining the robustness of our system to work effectively across all ethnic backgrounds.

3.3. Selecting Markers

The primary objective of our feature extraction phase is to identify specific points of interest, which we call markers, from the detected body outline segments for each body part. The 3D measurements of various horizontal body regions, including the waist, chest, hips, and shoulders, can then be estimated based on these markers. We identify markers in pairs, along horizontal body “slices”. To accomplish this, we first find the best vertical body line to utilise as an approximate mirror-like splitting axis. Then, we search for the right and left pairs of body extremities in each ROI, i.e., pixel locations at the body’s edge.
We first identify a top central head point—head tip—from the “height” ROI, which we use to centre our vertical splitting line. From there, we can walk along a body contour segment, for each ROI, on one side (left or right) and monitor the slope. When this slope goes through a large change in value, such as for the neck and shoulder ROIs, we determine a potential marker location. Alternatively, we select a locus mid-way along a contour segment, which proves useful for other ROIs (see Figure 5).
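A sketch of this slope-monitoring search on one ordered contour segment is given below; the smoothing window and the use of the tangent angle are our illustrative choices.

```python
import numpy as np

def slope_change_marker(contour_pts, window=5):
    """Return the contour point where the local slope changes most abruptly
    (e.g., a neck-to-shoulder transition). contour_pts is an ordered (N, 2)
    sequence of (x, y) pixel coordinates along one body outline segment."""
    pts = np.asarray(contour_pts, dtype=float)
    dx = np.gradient(pts[:, 0])
    dy = np.gradient(pts[:, 1])
    angles = np.arctan2(dy, dx)  # local tangent direction along the contour
    # Smooth the angle differences over a small window (angle wrap-around
    # is ignored here for brevity) and pick the largest change.
    change = np.abs(np.convolve(np.diff(angles),
                                np.ones(window) / window, mode="same"))
    return pts[int(np.argmax(change))]  # candidate marker location
```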
A collection of measurements from the ISO 8559 [18] standard, which prescribes the location of anthropometric measurements used in the production of physical and digital anthropometric databases, has been selected in order to compare the suggested method with other state-of-the-art approaches.
Currently, in our system, we allow for a semi-automatic method where a user can move the proposed (detected) markers of one ROI either horizontally, along the current line joining markers, or by first moving vertically that same line along the main body axis. This proves useful, especially when the automatic method may fail, such as when the A-pose is too “weak”, e.g., when the arms of the user are kept too close to their chest, such that an armpit is not clearly visible (see Figure 6 and Figure 7).
We note that this step could also be performed automatically in adverse cases by using an ML-based method; however, this would require additional training data, which we have yet to produce. In contrast, having a semi-automatic method for the selection of markers permits the user to take back some control over the system, which is often seen as a desirable feature, particularly by fashion designers or tailors.
After obtaining our set of marker pairs, we can approximate the circumferences of the upper human body for each horizontal slice by fitting an ellipse to the data points, using both the frontal and side views. The idea of using an ellipse as a useful geometric model of human horizontal body contours has been previously validated [27]. By evaluating the semi-axes of the ellipse from the two images (front and side), the circumference of a human body “slice” with respect to the selected marker pair can be estimated with sufficient accuracy (e.g., waist circumference).
Note that for our domain of application in the fashion sector, using an ellipse-like form tightly fit to a body horizontal contour (slice) is what is needed in most cases: i.e., we do not want a piece of clothing to follow too closely all the body contour details and deformations when considering or designing a pattern for cutting the fabric to be assembled into a wearable piece. For a different application domain, such as the medical realm, recovering detailed contour measurements might be desirable and the ellipse fitting process may only provide an initial approximation.
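To make the geometry concrete before turning to the perimeter formulas: each calibrated marker pair gives one axis of the slice ellipse, the front view its width and the side view its depth. A minimal sketch follows, assuming for simplicity a single pixel-to-centimetre scale valid for both views (in practice each view is calibrated as in Section 3.5).

```python
def semi_axes_cm(front_pair, side_pair, px_per_cm):
    """Semi-axes (cm) of a body-slice ellipse from two marker pairs.
    Each pair is ((x_left, y), (x_right, y)) in pixel coordinates."""
    width_px = abs(front_pair[1][0] - front_pair[0][0])  # front view: 2*x1
    depth_px = abs(side_pair[1][0] - side_pair[0][0])    # side view:  2*x2
    x1 = width_px / px_per_cm / 2.0
    x2 = depth_px / px_per_cm / 2.0
    return max(x1, x2), min(x1, x2)  # semi-major, semi-minor
```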

3.4. Computing the Perimeter of Fitted Ellipses

The real cross-sections of human subjects are only approximately elliptical (Figure 8 and Figure 9). The challenge is to discover the best fit. Unlike for a circle, there is no closed-form formula for the exact circumference of an ellipse. According to our data, these differences in shape can impact the final estimation, which motivated us to study how best to tackle this issue based on the data acquired (see Section 4.1). We decided to study a series of six equations from the literature to estimate the human body circumference, in order to reduce the difference from actual measurements. As can be seen in Figure 8, the body slices can become rounder, almost rectangular (top row), or, at another extreme, become squashed like a pressed oval shape (bottom row).
Figure 8 shows some of the participant shapes for which we collected data during our capture procedure. The initial step involved using PifuHD [28] to transform the photos into 3D models (.fbx files). Subsequently, Autodesk Maya was utilised to separate the specific body parts from these 3D models.
The six equations we considered were subsequently narrowed down to two, as shown below, based on their ability to provide good accuracy levels tailored to different upper body shapes while remaining simple to compute. The detailed comparison and evaluation of these six equations can be found in Appendix A.3. The selected pair of equations are
$$P \approx 2\pi \sqrt{\frac{x_1^2 + x_2^2}{2}}, \tag{1}$$
$$P \approx \pi \left[ 3(x_1 + x_2) - \sqrt{(3x_1 + x_2)(x_1 + 3x_2)} \right], \tag{2}$$
where $x_1$ and $x_2$ are the lengths of the semi-major and semi-minor axes, respectively.
Our analysis considered the shape variations found among individuals. We conducted a linear regression analysis with the variables $x_1$ and $x_2$. Figure 10 illustrates these variables in relation to waist circumferences. In summary, Equation (1) produces more accurate answers for ellipses with semi-major axes that are no more than 3 times longer than their semi-minor axes; otherwise, when an ellipse is more squashed, Equation (2) provides more accurate results (details in Section 4.1).
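The two equations and the 3:1 selection rule translate directly into code; in the following minimal sketch, the semi-axes are assumed to already be expressed in centimetres.

```python
import math

def perimeter_eq1(x1, x2):
    # Equation (1): P ~ 2*pi*sqrt((x1^2 + x2^2) / 2)
    return 2.0 * math.pi * math.sqrt((x1 ** 2 + x2 ** 2) / 2.0)

def perimeter_eq2(x1, x2):
    # Equation (2), Ramanujan: P ~ pi*(3*(x1+x2) - sqrt((3*x1+x2)*(x1+3*x2)))
    return math.pi * (3.0 * (x1 + x2)
                      - math.sqrt((3.0 * x1 + x2) * (x1 + 3.0 * x2)))

def slice_circumference(x1, x2):
    # Equation (1) for ellipses that are not overly elongated (x1 <= 3*x2),
    # Equation (2) for more squashed profiles (see Section 4.1).
    return perimeter_eq1(x1, x2) if x1 <= 3.0 * x2 else perimeter_eq2(x1, x2)
```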
Given one horizontal slice for a specific body part, we need to convert the pairs of markers (for the front and side views) from pixel values to world coordinates, which necessitates a camera calibration step, which we detail next.

3.5. Camera Calibration

Calibration provides the camera parameters, including the intrinsic 3 × 3 matrix K, the 3 × 3 rotation matrix R, and the 3 × 1 translation vector t, using a set of known 3D points and their corresponding image coordinates. Once calibrated, image pixels can be related to distances or sizes in the physical world (say, in centimetres). Calibration parameters are determined by assessing the image size in pixels and the size of the object within the image. In one instance, we leverage the classic use of a simple checkerboard pattern for its orthogonal and regular geometry, which makes it recognisable for both users and computer vision systems. While the checkerboard is effective, its practicality for everyday users may be limited. Recognising this, we have also tested a popular alternative of relying on the user’s height as a calibration reference. Using a reference height necessitates that the camera face the subject straight on, perpendicular to the body, to obtain a valid calibration. Note that we have reduced the impact of the camera being slightly mis-aligned with respect to the body by implementing affine and metric corrections and by employing two images (rather than just one) to gather depth information. Such corrections help in transforming the image coordinates to world coordinates, compensating for potential perspective distortions. In our experiments, accurate height inputs align well with checkerboard-based measurements. Slight deviations in the height input, however, reduce the accuracy significantly. We note, from our investigations, that a high percentage of users (as high as 70%) may not know their height with sufficient precision, introducing potentially significant calibration inaccuracies.
Yet another alternative can be to use other calibration patterns to enhance precision. For example, a practical option is to use largely accessible and standard-sized cards, like credit or bank cards. Additionally, it is worth noting that some existing applications employ a two-step calibration process. They not only rely on the user-provided height but also instruct users to stand within a specific distance from the camera’s point of view. Users are directed to move forward and backward within the application display until they are positioned at their ideal distance from the camera. These distance measurements are then used in conjunction with the user’s height input to convert pixel units to centimetres, further contributing to the overall calibration process. This represents yet another option, although it is less convenient from the user’s perspective.
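For reference, a minimal checkerboard calibration with OpenCV is sketched below, alongside the height-based pixel scale; the 9 × 6 inner-corner pattern and 2.5 cm square size are assumptions for illustration.

```python
import cv2
import numpy as np

def calibrate(images, pattern=(9, 6), square_cm=2.5):
    """Checkerboard calibration sketch (classic OpenCV workflow)."""
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_cm
    obj_pts, img_pts, size = [], [], None
    for img in images:
        grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        size = grey.shape[::-1]
        found, corners = cv2.findChessboardCorners(grey, pattern)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    # Returns RMS error, the intrinsic matrix K, distortion coefficients,
    # and per-view rotations/translations (R, t) discussed in the text.
    return cv2.calibrateCamera(obj_pts, img_pts, size, None, None)

def px_per_cm_from_height(head_tip_y, feet_y, stated_height_cm):
    """Height-based alternative: a single pixel-to-cm scale, sensitive to
    inaccurate height input as noted above."""
    return abs(feet_y - head_tip_y) / stated_height_cm
```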

4. Results and Discussion

To effectively design a machine learning model combined with additional computer vision techniques, we require a dataset of a sufficient size with good annotations (in our case, ROIs with correct labels). Our study necessitates a compilation of human body images and their associated ROIs for chest, bust, waist, and hip circumferences. As there was, at the start of our research, an absence of pre-existing appropriate datasets, we created our own.
  • Participant Selection:
The participants in the dataset are volunteers who self-identified as female and are over 18 years old. According to several surveys, women purchase clothing online more than men, and thus we decided to focus our study first on women; moreover, based on our study and experiments, the average inaccuracy of existing applications is greater for women than for men (Figure 1), making this the more challenging case. Thus far, we have collected for training purposes a relatively small dataset of 78 female bodies, representing a variety of body shapes and heights. Nevertheless, our results are promising.
Our approach to selecting participants had two components. The first group of participants consisted of volunteer students from Goldsmiths, University of London, actively recruited through the dissemination of posters and invitations sent via email; in appreciation for their time and contribution, these volunteers were compensated monetarily. The second group consisted of volunteer customers associated with Nevena Nikolova, an established couture designer operating worldwide. These participants were sourced owing to Ms. Nikolova’s willingness to collaborate and contribute to this research study.
  • Ground truth:
Using tape measurements, “ground truth” references were obtained from each individual. (Here, “ground truth” corresponds to what fashion designers have traditionally been happy to use for centuries.) The participants were then photographed in various environments with different lighting, backgrounds, body postures, distances from the camera, and clothing types. The photographs were taken with an iPhone 10 (or later) and a Galaxy S21, as these smartphones are representative of the most commonly used ones. In an effort to reduce human error, we obtained our tape measurements in accordance with the procedures outlined in ISO 8559-1 [18], ISO 8559-2 [29], and ISO 8559-3 [30]. In addition, we received assistance and guidance from Ms. Nikolova, a couture fashion designer, in obtaining our tape measurements.
The quantitative data for the 78 female participants were analysed using descriptive statistics and a two-sample t-test assuming unequal variances (Welch’s t-test). Descriptive statistics were used throughout all phases of our data collection, and Welch’s t-test was used to compare the reliability of the measured data between Equations (1) and (2). Lastly, we also used a t-test to compare our software results to actual tape measurements, to verify that we improve on the accuracy of state-of-the-art methods. At each stage, the quantitative data were analysed and compared in terms of the accuracy and reliability of each method and technique.
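The Welch's t-test itself is a one-liner with SciPy; in the sketch below, the arrays are placeholders for the per-measurement absolute differences (in cm) from the tape reference under each equation.

```python
import numpy as np
from scipy import stats

# Placeholder error arrays; in the study, each holds one absolute difference
# (cm) from the tape measurement per estimated body circumference.
err_eq1 = np.array([0.9, 0.7, 1.1, 0.6])
err_eq2 = np.array([0.6, 0.5, 0.8, 0.7])

# Two-sample t-test assuming unequal variances (Welch's t-test).
t_stat, p_two_tailed = stats.ttest_ind(err_eq1, err_eq2, equal_var=False)
p_one_tailed = p_two_tailed / 2.0  # valid when the observed direction matches H1
```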

4.1. Ellipse Model Results

We now present the results of a t-test analysis conducted to compare the accuracy of two ellipse-based formulas (Equations (1) and (2)) for estimating upper human body circumferences. The t-test was chosen to determine if there is a statistically significant difference between the means of the two formulas, which represent the average difference from the tape measurements. Lower mean values indicate higher accuracy.
The sample mean for Equation (1) was 0.7578 cm, and for Equation (2) it was 0.6647 cm. The respective variances for these formulas were 0.2149 and 0.1981. The two-sample t-test yielded a t-statistic of 2.5585 and a degrees of freedom (df) value of 621. The one-tailed p-value was found to be 0.0054, and the corresponding t-critical value for a one-tailed test was 1.6473. The two-tailed p-value was calculated as 0.0107, with a t-critical value of 1.9638 for a two-tailed test.
The t-test results indicate that there is a statistically significant difference between the means of the two ellipse-based formulas for estimating upper human body circumferences. The t-statistic (2.5585) is greater than the two-tailed t-critical value (1.9638), and the two-tailed p-value (0.0107) is lower than the significance level (0.05). These results support the conclusion that the two formulas differ significantly in their performance for estimating upper human body measurements. Since the mean value for Equation (2) (0.6647) is lower than that for Equation (1) (0.7578), it can be inferred that Equation (2) has a higher accuracy level in estimating upper human body circumferences.
Although Equation (2) offers better accuracy than Equation (1) on average, the descriptive statistics revealed that out of the 312 data points collected for human body circumferences (chest, bust, waist, and hips), Equation (1) provided more accurate measurements for approximately 35.9% of the instances, while Equation (2) was more accurate for the remaining 64.1% of the instances. This observation suggests that the accuracy of these equations varies depending on the specific body shape (horizontal slices), with one equation being more accurate than the other in minimising the discrepancy between direct measurements and software-based estimations.
Based on our data collected from the participants, and by best fitting the human body shape (horizontal slices) with ellipses, we observed that for elliptic profiles whose major axis was not more than 3 times longer than their minor axis, the stress level of Equation (2) (M = 0.857, SD = 0.232, n = 112; M = mean, SD = standard deviation, n = sample size) was hypothesised to be greater than that of Equation (1) (M = 0.456, SD = 0.101, n = 112). This difference proved extremely significant: t(192) = 7.361, p = 5.19981 × 10⁻¹² (two-tailed). Statistically, the differences between the groups are significant, as the p-value is far lower than 0.05, and we can reject the null hypothesis (Figure 11).
In the following Figure 11 and Figure 12, stress levels for each formula are illustrated on a bell curve. Additionally, the figures display the average variance and standard deviation when comparing measurements of chest, bust, waist, and hip circumferences obtained through direct (tape) measurements and our software.
However, it was hypothesised that when the major axis of the body shape is 3 (or more) times longer than the minor one, the stress level of Equation (1) (M = 0.927, SD = 0.200, n = 200) would be higher than that of Equation (2) (M = 0.556, SD = 0.147, n = 200). This difference was highly significant: t(389) = 8.917, p = 1.89062 × 10⁻¹⁷ (two-tailed). Please refer to the bell curve in Figure 12 to see the data in more detail.
In Figure 13, we summarise the average measurement error differences between our software and tape measurements for 78 female participants. The figure provides key statistical characteristics, including the mean, minimum, maximum, and standard deviation, for each equation according to different body shapes. The data presented were collected using Equations (1) and (2), respectively.
In summary, these results indicate that, as a function of variations in body shape (horizontal slices), one formula is more accurate than others in minimising the difference between direct measurement and software solutions. For body shapes with semi-major axes that are not more than 3 times longer than their semi-minor axes, or in other words, ellipses that are not overly elongated, Equation (1) provides more accurate results, whereas Equation (2) provides more accurate results for body shapes with semi-major axes that are more than 3 times longer than their semi-minor axes (like squashed ellipses). By utilising two ellipse equations based on different body shapes, we were able to reduce the mean (average) inaccuracy to ±1 cm, which is sufficient for a number of real-life applications.

4.2. Final Results

The quantitative data for the 78 female participants were analysed using descriptive statistics. We captured 2D images of each participant based on the following scenarios:
  • With a plain background;
  • With a cluttered or textured background;
  • Using a pair of different body postures (A-pose and relax pose);
  • Using different distances from the camera;
  • With clothing that has visible creases;
  • Under different lighting conditions;
  • Using devices with different camera specifications:
    (a) Apple devices;
    (b) Samsung devices.
With 78 participants, this has resulted in an initial dataset with 312 data points.
Table 1 displays the average differences between the measurements produced by our technique and those taken with a measuring tape for all 78 participants. Our upper human body measurements show an average difference of less than ±1 cm. Figure 14 also displays the error correlation for each individual participant.
From our detailed experiments, we note that the maximum differences from the ground truth occurred when a participant wore creased clothing or when the camera was rather close to or far from the participant (i.e., less than 0.5 m or more than 3 m away). However, when the background was highly cluttered or the lighting was not well diffused, our software’s inaccuracy rose by an average of only 0.1 cm, demonstrating that our method is able to improve the image quality and reduce the adverse effects of noise.
The tests conducted in this research have demonstrated the importance of users uploading high-quality images. In general, current RGB cameras generate sufficiently high-resolution images for this type of application. Providing clear directions and examples in a tutorial can alleviate challenging circumstances; our experience has shown that this can lower the mean difference in our results by up to 0.5 cm. From a designer’s perspective, the ability to measure human body circumferences with an average difference of at most 1 cm from the customer’s body shape enables more efficient processing from raw materials to completed products and, we expect, should significantly reduce returns and waste.
Comparing our results with existing software and technologies in the market reveals several advantages of our system. Existing applications typically suffer from limitations, such as the need for external devices like 3D scanners, which may not be easily accessible to all users. Other applications based on widely accessible camera phones or tablets may require users to stand still for a few seconds without any movement, and some even necessitate a plain uniform background for accurate measurements. In many cases, users must stand at an exact distance from the camera to ensure accurate calibration. Most apps require users to provide their height information for calibration purposes. However, it has been observed that a significant number of users do not possess precise height information, which can impact the calibration process and subsequent measurements. Additionally, some software solutions provide only coarse ranges of measurements, such as the small, medium, large, XL, or XXL classifications used in retail.
In contrast, based on a camera phone or tablet, our system aims for simplicity while maintaining sufficient accuracy. It does not require re-training or access to large datasets, as it relies only on a pre-trained network and this solely for initial 2D image segmentation into a set of semantically useful ROIs. Our system does not require users to stand still for extended periods or against a specific background. It is designed to handle varying distances between the user and the camera, providing flexibility without compromising measurement accuracy. Furthermore, our system minimises the dependency on user-provided height information, reducing the impact of inaccurate or unknown height measurements. Additionally, our system provides detailed measurements beyond coarse size classifications, enabling more precise fitting and customisation.

5. Conclusions

In this research, the proposed technique aims to improve and facilitate the experience of the e-commerce apparel and fashion sector by providing a simple method for the typical user to estimate their characteristic human body measurements from a pair of 2D images, captured with a portable device, such as a modern camera phone or a tablet. Our findings were focused on the human female upper body (hips to head), which offers more variability in shape.
Our initial goal was to design a method for anthropometric measurements based on 2D images to estimate human body horizontal circumferences with an average error of less than one centimetre. Our analysis resulted in the selection of two different formulas to estimate such circumferences as best-fitting ellipses. While Yao et al. [27] favoured the more elaborate super-ellipses for estimating human body cross-sections such as the torso and limbs, our results indicate that the use of simpler ellipses gives sufficient accuracy. We propose that our system’s robustness against cluttered backgrounds and varying lighting conditions, coupled with its simplicity and accuracy, makes it a valuable tool for the apparel industry. Practical improvements, such as providing clearer directions and tutorials to users, have the potential to further reduce the mean error of our system to well under a centimetre.
Finally, we note that our design choice of a hybrid architecture has proven valuable. By limiting the use of machine learning to a simpler task of ROI identification, we have avoided the need to access a large specific training dataset. By construction, the ROIs are directly related to useful anatomical body parts, informed by knowledge from the fashion design sector. They are kept relatively small in size, which facilitates and improves the performance of our set of computer vision techniques for automatically identifying markers.

Author Contributions

Both authors contributed equally to the conceptualisation, methodology, research, validation, and writing (original draft preparation, as well as review and further editing). M.M. was responsible for the software development and testing and for data curation. All authors have read and agreed to the final version of this manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was approved by the Ethics Committee of Goldsmiths, University of London (13 October 2021).

Informed Consent Statement

Informed consent was obtained from all subjects involved in this study.

Data Availability Statement

The datasets generated or analysed during the current study are available upon reasonable request. Researchers interested in accessing the data may contact the corresponding author ([email protected]). The datasets will be provided as Excel sheets to facilitate data sharing.

Acknowledgments

We express our sincere gratitude to Nevena Nikolova for her invaluable assistance in our project, thanks to her expertise in the fashion industry and her generous support in data collection. More details on her work can be found on her website at https://nevenacouture.com/ (accessed on 11 December 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

The selected anthropometric measurements are shown in Table A1.
Table A1. Anthropometric measurements chosen from the ISO 8559 standard.

| Title | Measurement Type | Title | Measurement Type |
| --- | --- | --- | --- |
| opx | Chest/bust girth | opn | Neck girth |
| opp | Under chest/bust girth | oph | Wrist girth |
| ot | Waist girth | SvRv | Shoulder length |
| ob | Hips girth | SyTy | Back length |

Appendix A.1. Error Correlation for the Five Existing Apps in Detail

The following statistics in Figure A1 and Figure A2 represent the average error in centimetres of different horizontal sections of the human body for five different applications: 3DLookMe [31], SizeStream (MeThreeSixty) [32], Esenca [33], Presize [34], and TechMed [35].
Figure A1. Male error correlations.
Figure A2. Female error correlations.

Appendix A.2. Data Collection and Dataset Segmentation

Data have been collected from 78 participants, totalling 312 images (front views only). Each image was annotated, initially isolating the human body from the surrounding environment and subsequently identifying distinct sections of the body, including the head, chest, shoulders, etc. For a comparative study of various object detection algorithms, this dataset was segmented into training and test sets in an 80:20 ratio: the training set encompasses 249 images and the test set 63 images. This subdivision facilitates a structured approach to evaluating and comparing the performance and effectiveness of the object detection algorithms assessed.
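For reproducibility, this 80:20 split can be obtained as sketched below with scikit-learn (the placeholder list and fixed seed are our illustrative choices, not necessarily the tooling used in the study). Note that with 312 items this yields exactly 249 training and 63 test images, matching the counts above.

```python
from sklearn.model_selection import train_test_split

annotated = list(range(312))  # stand-ins for (image, annotation) pairs
train_set, test_set = train_test_split(annotated, test_size=0.2,
                                       random_state=42)
print(len(train_set), len(test_set))  # 249 63
```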
Table A2. Performance metrics for the MobileNet SSD object detection model in our system.

| Measure | MobileNet SSD |
| --- | --- |
| Precision | 0.79 |
| Recall | 0.54 |
| F1-score | 0.61 |
| mAP@0.5 | 0.70 |
| mAP@0.5:0.95 | 0.55 |
| Inference time (ms) | 8.4 |

Appendix A.3. Calculation of an Ellipse Perimeter

We approximate the human body using a series of ellipses stacked vertically along a main body axis, seen as a number of horizontal slices. For any such slice, the circumference of the human body can be approximated using the equation for an ellipse. To determine the perimeter of the ellipse accurately, for which there are no closed-form solutions, we need to calculate the semi-major and semi-minor axes. In our study, we evaluated six different approximations from the literature to compute an approximation to the perimeter of an ellipse, with the aim of identifying a compromise between accuracy and complexity [36]. In this paper, we presented the two selected approximations, namely Equations (1) and (2). The remaining four approximations we considered are provided below:
$$P \approx \pi \, (x_1 + x_2) \tag{A1}$$

$$P \approx \pi \sqrt{2 \, (x_1^2 + x_2^2)} \tag{A2}$$

$$P \approx \pi \, (x_1 + x_2) \left( 1 + \frac{3h}{10 + \sqrt{4 - 3h}} \right), \quad \text{where } h = \frac{(x_1 - x_2)^2}{(x_1 + x_2)^2} \tag{A3}$$

$$P = 4 \int_{0}^{\pi/2} \sqrt{x_1^2 \cos^2 \theta + x_2^2 \sin^2 \theta} \; d\theta \tag{A4}$$
Equation (A3) shows how variations in the eccentricity of an ellipse can have a significant effect on the accuracy of the perimeter estimate. Note that Equation (A4), although exact in theory, involves an integral that can only be approximated numerically in practice. We ruled out Equations (A3) and (A4), judging that the slight increase in accuracy was not worth the increase in computational complexity.
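To make this trade-off concrete, the four approximations can be compared numerically. The following Python sketch is illustrative (the function names are ours, and Equation (A4) is evaluated with a simple midpoint rule rather than a library quadrature routine):

```python
import math

def perimeter_a1(a: float, b: float) -> float:
    """Equation (A1): sum of the semi-axes."""
    return math.pi * (a + b)

def perimeter_a2(a: float, b: float) -> float:
    """Equation (A2): root mean square of the semi-axes."""
    return math.pi * math.sqrt(2.0 * (a * a + b * b))

def perimeter_a3(a: float, b: float) -> float:
    """Equation (A3): correction term via h, as in Ramanujan's formula [36]."""
    h = ((a - b) ** 2) / ((a + b) ** 2)
    return math.pi * (a + b) * (1.0 + 3.0 * h / (10.0 + math.sqrt(4.0 - 3.0 * h)))

def perimeter_a4(a: float, b: float, n: int = 10_000) -> float:
    """Equation (A4): exact integral, approximated with a midpoint rule."""
    dt = (math.pi / 2.0) / n
    total = sum(
        math.sqrt((a * math.cos((k + 0.5) * dt)) ** 2 +
                  (b * math.sin((k + 0.5) * dt)) ** 2)
        for k in range(n)
    )
    return 4.0 * total * dt

# Example: a waist-like slice with semi-axes of 15 cm and 10 cm.
if __name__ == "__main__":
    a, b = 15.0, 10.0
    for f in (perimeter_a1, perimeter_a2, perimeter_a3, perimeter_a4):
        print(f"{f.__name__}: {f(a, b):.3f} cm")
```

For semi-axes of 15 cm and 10 cm, for instance, Equations (A1) and (A2) bracket the true perimeter (about 78.54 cm and 80.10 cm around a true value near 79.33 cm), while Equation (A3) agrees with the numerically evaluated integral to well under a millimetre.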
Figure A3. A comparison of various formulae for the perimeter of an ellipse. Vertical axis: logarithmic scale measuring error as ratios or percentages. Horizontal axis: ellipse aspect ratio compared to a circle.

References

  1. Tomi, B.; Sunar, M.S.; Mohamed, F.; Saitoh, T.; Bin Mokhtar, M.K.; Luis, S.M. Dynamic Body Circumference Measurement Technique for a More Realistic Virtual Fitting Room Experience. In Proceedings of the IEEE Conference on e-Learning, e-Management and e-Services (IC3e), Langkawi, Malaysia, 21–22 November 2018; pp. 56–60. [Google Scholar] [CrossRef]
  2. Xiaohui, T.; Xiaoyu, P.; Liwen, L.; Qing, X. Automatic human body feature extraction and personal size measurement. J. Vis. Lang. Comput. 2018, 47, 9–18. [Google Scholar] [CrossRef]
  3. Chang, H.T.; Li, Y.W.; Chen, H.T.; Feng, S.Y.; Chien, T.T. A Dynamic Fitting Room Based on Microsoft Kinect and Augmented Reality Technologies. In Human-Computer Interaction, Interaction Modalities and Techniques, Proceedings of the 15th International Conference, HCI International 2013, Las Vegas, NV, USA, 21–26 July 2013; Kurosu, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 177–185. [Google Scholar] [CrossRef]
  4. Chandra, R.N.; Febriyan, F.; Rochadiani, T.H. Single Camera Body Tracking for Virtual Fitting Room Application. In Proceedings of the 10th International Conference on Computer and Automation Engineering, ICCAE, ACM, Brisbane, Australia, 24–26 February 2018; pp. 17–21. [Google Scholar] [CrossRef]
  5. Ashmawi, S.; Alharbi, M.; Almaghrabi, A.; Alhothali, A. FitMe: Body Measurement Estimations Using Machine Learning Method. Procedia Comput. Sci. 2019, 163, 209–217. [Google Scholar] [CrossRef]
  6. Choudhary, S.; Iyer, G.; Smith, B.M.; Li, J.; Sippel, M.; Criminisi, A.; Heymsfield, S.B. Development and validation of an accurate smartphone application for measuring waist-to-hip circumference ratio. NPJ Digit. Med. 2023, 6, 168. [Google Scholar] [CrossRef] [PubMed]
  7. Danadoni, F.; Mechan, J.; Johnson, C. "PS for You": A Contactless, Remote, Cloud-Based Body Measurement Technology Powered by Artificial Intelligence. In Proceedings of the 12th International Conference and Exhibition on 3D Body Scanning and Processing Technologies (3DBODY.TECH), Lugano, Switzerland, 19–20 October 2021. [Google Scholar] [CrossRef]
  8. Johnson, C.; Danadoni, F. A Pilot Study Using a Remote, AI-Powered Measurement Technology to Enable a Decentralised Production System, from Ideation to Delivery. In Proceedings of the 13th International Conference and Exhibition on 3D Body Scanning and Processing Technologies (3DBODY.TECH), Lugano, Switzerland, 25–26 October 2022. [Google Scholar] [CrossRef]
  9. Kaashki, N.N.; Hu, P.; Munteanu, A. Anet: A Deep Neural Network for Automatic 3D Anthropometric Measurement Extraction. IEEE Trans. Multimed. 2023, 25, 831–844. [Google Scholar] [CrossRef]
  10. Choutas, V.; Müller, L.; Huang, C.H.P.; Tang, S.; Tzionas, D.; Black, M.J. Accurate 3D Body Shape Regression using Metric and Semantic Attributes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; Available online: https://shapy.is.tue.mpg.de/ (accessed on 1 September 2022).
  11. Mogavero, T. Clothed in Conservation: Fashion & Water. Sustainable Campus, Florida State University. 2020. Available online: https://sustainablecampus.fsu.edu/blog/clothed-conservation-fashion-water (accessed on 23 July 2020).
  12. World Wildlife Fund. Threats: Water Scarcity. 2023. Available online: https://www.worldwildlife.org/threats/water-scarcity (accessed on 23 July 2020).
  13. Brooks, A.L.; Brooks, E.P. Towards an inclusive virtual dressing room for wheelchair-bound customers. In Proceedings of the International Conference on Collaboration Technologies and Systems (CTS), IEEE, Minneapolis, MN, USA, 19–23 May 2014; pp. 582–589. [Google Scholar] [CrossRef]
  14. PRNewswire. Consumers to Return Half of Online Clothing Purchases This Holiday Season. Technical Report. 2018. Available online: https://www.prnewswire.com/news-releases/consumers-to-return-half-of-online-clothing-purchases-this-holiday-season-300760466.html (accessed on 2 May 2023).
  15. Halliday, S. Online Fashion Returns Soar as Shoppers Lack Size Info. Technical Report, Fashion Network. 2018. Available online: https://ww.fashionnetwork.com/news/Online-fashion-returns-soar-as-shoppers-lack-size-info,957327.html (accessed on 2 May 2023).
  16. Inman, D. Retail Returns Increased to $761 Billion in 2021 as a Result of Overall Sales Growth. Technical Report, National Retail Federation (NRF). 2022. Available online: https://nrf.com/research/customer-returns-retail-industry-2021 (accessed on 2 February 2023).
  17. Faust, M.E.; Carrier, S. Designing Apparel for Consumers: The Impact of Body Shape and Size; Woodhead Publishing: Sawston, UK, 2014. [Google Scholar]
  18. ISO 8559-1:2017; Size Designation of Clothes—Part 1: Anthropometric Definitions for Body Measurement. International Organization for Standardization: Geneva, Switzerland, 2017.
  19. Lucas, T.; Henneberg, M. Use of units of measurement error in anthropometric comparisons. Anthropol. Anz. 2017, 74, 183–192. [Google Scholar] [CrossRef] [PubMed]
  20. Ulijaszek, S.J.; Kerr, D.A. Anthropometric measurement error and the assessment of nutritional status. Br. J. Nutr. 1999, 82, 165–177. [Google Scholar] [CrossRef] [PubMed]
  21. Meyer, P.; Birregah, B.; Beauseroy, P.; Grall, E.; Lauxerrois, A. Missing body measurements prediction in fashion industry: A comparative approach. Fash. Text. 2023, 10, 37. [Google Scholar] [CrossRef]
  22. Howard, A.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  23. Maragos, P. Morphological filtering for image enhancement and feature detection. In The Image and Video Processing Handbook; CRC Press: Boca Raton, FL, USA, 2005; pp. 135–156. [Google Scholar] [CrossRef]
  24. Shapiro, L.; Stockman, G. Computer Vision; Prentice Hall: Upper Saddle River, NJ, USA, 2001. [Google Scholar]
  25. Rajinikanth, V.; Raja, N.S.M.; Dey, N. A Beginner’s Guide to Multilevel Image Thresholding; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar] [CrossRef]
  26. Kolkur, S.; Kalbande, D.; Shimpi, P.; Bapat, C.; Jatakia, J. Human Skin Detection Using RGB, HSV and YCbCr Color Models. In Proceedings of the International Conference on Communication and Signal Processing (ICCASP 2016), Lonere, India, 26–27 December 2016; Atlantis Press: Dordrecht, The Netherlands, 2017. [Google Scholar] [CrossRef]
  27. Yao, J.; Zhang, H.; Zhang, H.; Chen, Q. R&D of a parameterized method for 3D virtual human body based on anthropometry. Int. J. Virtual Real. 2008, 27, 9–12. [Google Scholar]
  28. Saito, S.; Simon, T.; Saragih, J.; Joo, H. PiFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 81–90. [Google Scholar] [CrossRef]
  29. ISO 8559-2; Size Designation of Clothes—Part 2: Primary and Secondary Dimension Indicators. International Organization for Standardization: Geneva, Switzerland, 2017.
  30. ISO 8559-3; Size Designation of Clothes—Part 3: Methodology for the Creation of Body Measurement Tables and Intervals. International Organization for Standardization: Geneva, Switzerland, 2018.
  31. 3Dlook.Me. 2023. Available online: https://3dlook.me/ (accessed on 11 December 2023).
  32. SizeStream. 2023. Available online: https://www.sizestream.com/ (accessed on 11 December 2023).
  33. Esenca. 2023. Available online: https://www.esenca.app/ (accessed on 11 December 2023).
  34. Presize. 2023. Available online: https://www.presize.ai/ (accessed on 12 December 2021).
  35. TechMed3D. 2023. Available online: https://techmed3d.com/ (accessed on 11 December 2023).
  36. Berndt, B.C. Ramanujan’s Notebooks: Part I; Springer: New York, NY, USA, 1985. [Google Scholar] [CrossRef]
Figure 1. Average differences (in cm) and standard deviations for the selected five existing commercial online applications.
Figure 2. Proposed method as a diagram view of a pipeline of steps and processes.
Figure 3. Image obtained by computing a set of ROIs from a set of training images. The full body ROI (labelled “Person”) is only used for illustrative and comparative purposes and is not required in our pipeline.
Figure 4. Illustration of the use of our skin detection method, combined with a background segmentation to show different skin regions being detected and how hair in the neck and shoulder areas can be separated from the body contours. In our pipeline, we only apply this technique to selected ROIs.
Figure 5. Illustration of the procedure for automatic marker detection, applied to individual ROIs. A proper A-pose leads to a useful automatic marker localisation (red dots, also indicated by arrows for greater visibility).
Figure 6. Illustration of the procedure for automatic marker detection (red dots), which leads to incorrect localisation in some cases. An incorrect A-pose (with the arms kept too close to the upper body) results here in the inaccurate automatic detection of some markers, near the chest and bust areas.
Figure 7. Automatic and interactive localisation of detected markers. The first two images on the left illustrate the markers automatically detected (red dots) for the front and side views. The two images on the right show the final marker localisation after the user has adjusted some markers. Only in the third image (front view) have two pairs of markers been moved horizontally (yellow dots); the other markers were judged correct by the user and kept as provided by the system.
Figure 8. Illustration of typical variations in horizontal upper body slices together with their fitted best ellipse. (A–C) represent the chest circumferences of various people, while (D–F) indicate the waist circumferences of different individuals.
Figure 9. Female template with examples of virtual body cross-sections (image courtesy of ISO www.iso.org (accessed on 1 September 2021)).
Figure 10. Illustration of a participant’s waist circumference, showcasing variability in shape while remaining approximately elliptic. The major axis is shown as the green line and the minor axis as the blue line. The blue dashed line illustrates that human waist profiles resemble an ellipse.
Figure 11. This comparison specifically focuses on elliptic profiles where the major axis is at most 3 times longer than the minor axis.
Figure 12. This comparison specifically focuses on elliptic profiles where the major axis is more than 3 times longer than the minor axis.
Figure 13. Key statistical characteristics for four body sections: chest, bust, waist, and hips, with data collected using Equations (1) and (2). Vertical axes show error levels expressed in centimetres.
Figure 14. The plots correspond to the average error difference for the circumferences of the chest, bust, waist, and hips in different environments.
Table 1. A summary of the differences between our software and tape-measured data for the 78 participants (312 total cases).

| Body Type | Chest | Bust | Waist | Hips |
| --- | --- | --- | --- | --- |
| Mean Differences (cm) | 0.78 | 0.78 | 0.96 | 0.78 |
| Median Differences (cm) | 0.76 | 0.76 | 0.82 | 0.76 |
| Max Differences (cm) | 3.14 | 2.66 | 3.19 | 3.82 |
| Min Differences (cm) | 0.01 | 0.01 | 0.02 | 0.02 |
| Standard Deviation (cm) | 0.55 | 0.50 | 0.62 | 0.56 |
