Handmade products differ from mass-produced items in that each piece is uniquely crafted by hand. Setting an initial price for such products is therefore challenging: an inappropriate price can cut into the seller's profit or reduce buyers' interest. Existing price prediction methods mainly use two modalities, combining images with either metadata or text. To the best of our knowledge, no price prediction method uses all three modalities (images, metadata, and text), let alone one specifically designed for handmade items. In this study, we therefore build a multimodal neural network that leverages product images, metadata, and text to create a price prediction tool specialized for handmade items. Experiments on actual handmade items sold on Mercari show that the model incorporating all three modalities achieves the highest Pearson correlation coefficient, 0.569: an improvement of 0.071 points over the model using only images and metadata, and of 0.106 points over the model using only images and text. These results demonstrate the effectiveness of combining all three modalities for price prediction of handmade items.
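The abstract describes fusing three modalities for a single price estimate. As a minimal sketch of one common way to do this (simple late fusion: concatenate per-modality embeddings, then apply a regression head), the following illustrates the idea with toy vectors. All dimensions, names, and the random "embeddings" here are illustrative assumptions, not details taken from the paper, whose actual architecture is not specified in this abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embedding sizes for each modality (illustrative choices only).
IMG_DIM, META_DIM, TEXT_DIM = 8, 4, 6


def fuse_and_predict(img_emb, meta_emb, text_emb, w, b):
    """Late fusion: concatenate the three modality embeddings and apply a
    linear regression head to produce a scalar price estimate."""
    fused = np.concatenate([img_emb, meta_emb, text_emb])
    return float(fused @ w + b)


# Toy vectors standing in for the outputs of image / metadata / text encoders.
img_emb = rng.normal(size=IMG_DIM)
meta_emb = rng.normal(size=META_DIM)
text_emb = rng.normal(size=TEXT_DIM)

# Randomly initialized head weights; in practice these would be learned
# jointly with the encoders on sold-item price data.
w = rng.normal(size=IMG_DIM + META_DIM + TEXT_DIM)
b = 0.0

price = fuse_and_predict(img_emb, meta_emb, text_emb, w, b)
print(price)
```

In a trained system the concatenated vector would typically pass through one or more nonlinear layers rather than a single linear map, but the fusion step itself is the same.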