1) Have you tested / benchmarked the system?
Yes, we have tested the system on a number of test datasets. Results on three externally constructed test datasets for 3-class sentiment analysis showed that SentiMo has a 1.2-2.9% accuracy advantage over a commercially leading tool. We invite companies to sign up for licenses to further evaluate and use the tool for their respective use cases.
Fig 2. Performance comparison results
2) Did you use open source tools?
We constructed the lexicon database from scratch, referencing publicly available linguistic resources such as English dictionaries, Wikipedia, and Urban Dictionary, as well as published academic papers. We coded the engine 100% in-house at the Institute of High Performance Computing, A*STAR.
3) When is the system ready?
SDK Version 3.0 has been available to our current partners since 28 February 2017.
4) What does SentiMo's sentiment output mean?
The main part of SentiMo's output is "Sentiment", which takes one of the following six values:
|Sentiment||Value||Meaning|
|Neutral||0||Neither positive nor negative|
|Mixed-Neutral||1||Contains both positive and negative sentiments with equal weightage of each|
|Negative||2||Contains only negative sentiments|
|Mixed-Negative||3||Contains both positive and negative sentiments, but with a stronger weightage of negative sentiments|
|Positive||4||Contains only positive sentiments|
|Mixed-Positive||5||Contains both positive and negative sentiments, but with a stronger weightage of positive sentiments|
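The six values above could be represented in client code as an enumeration. The following is a minimal sketch in Python; the names `Sentiment` and `is_mixed` are illustrative assumptions and are not part of the SentiMo API or SDK.

```python
from enum import IntEnum


class Sentiment(IntEnum):
    """Hypothetical client-side mirror of SentiMo's six sentiment values."""
    NEUTRAL = 0
    MIXED_NEUTRAL = 1
    NEGATIVE = 2
    MIXED_NEGATIVE = 3
    POSITIVE = 4
    MIXED_POSITIVE = 5


# The three values that contain both positive and negative sentiments.
MIXED_VALUES = frozenset(
    {Sentiment.MIXED_NEUTRAL, Sentiment.MIXED_NEGATIVE, Sentiment.MIXED_POSITIVE}
)


def is_mixed(value: int) -> bool:
    """Return True if the sentiment value denotes a mixed sentiment."""
    return Sentiment(value) in MIXED_VALUES
```

For example, `Sentiment(3)` evaluates to `Sentiment.MIXED_NEGATIVE`, and `is_mixed(3)` returns `True`.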
5) SentiMo produces six classes of Sentiment outputs. How can I convert SentiMo's outputs to other common schemes?
Converting SentiMo's 6-class outputs to 4-class (quaternary) is straightforward: map 0-Neutral to "Neutral"; 1-Mixed-Neutral, 3-Mixed-Negative, and 5-Mixed-Positive to "Mixed"; 2-Negative to "Negative"; and 4-Positive to "Positive".
Converting SentiMo's 6-class outputs to 3-class (trinary) is also straightforward: map 0-Neutral to "Neutral"; 1-Mixed-Neutral, 2-Negative, and 3-Mixed-Negative to "Negative"; and 4-Positive and 5-Mixed-Positive to "Positive".
Converting SentiMo's 6-class outputs to 2-class (binary) is complex and not recommended, mainly because purely 2-class sentiment expression (with no neutral utterances) is uncommon in real-world contexts. One possible approximation is:
Map 0-Neutral, 1-Mixed-Neutral, 2-Negative, and 3-Mixed-Negative to "Negative"; map 4-Positive and 5-Mixed-Positive to "Positive".
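The three conversions described in this answer can be expressed as lookup tables. The sketch below is written in Python and assumes the sentiment output is available as an integer from 0 to 5; the function name `convert_sentiment` and the table names are hypothetical and are not part of the SentiMo API or SDK.

```python
# Lookup tables keyed by SentiMo's 6-class sentiment value (0-5).
TO_4_CLASS = {
    0: "Neutral",
    1: "Mixed", 3: "Mixed", 5: "Mixed",
    2: "Negative",
    4: "Positive",
}

TO_3_CLASS = {
    0: "Neutral",
    1: "Negative", 2: "Negative", 3: "Negative",
    4: "Positive", 5: "Positive",
}

# Approximation only: the binary conversion is not recommended.
TO_2_CLASS = {
    0: "Negative", 1: "Negative", 2: "Negative", 3: "Negative",
    4: "Positive", 5: "Positive",
}

_SCHEMES = {4: TO_4_CLASS, 3: TO_3_CLASS, 2: TO_2_CLASS}


def convert_sentiment(value: int, classes: int = 3) -> str:
    """Map a 6-class sentiment value to a 4-, 3-, or 2-class label."""
    if classes not in _SCHEMES:
        raise ValueError("classes must be 2, 3, or 4")
    if value not in _SCHEMES[classes]:
        raise ValueError("value must be an integer from 0 to 5")
    return _SCHEMES[classes][value]
```

For example, `convert_sentiment(5, classes=4)` returns `"Mixed"`, while `convert_sentiment(5, classes=3)` returns `"Positive"`.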
6) What do SentiMo's emotion-related outputs mean?
The emotion outputs indicate whether the text message expresses or implies one or more of six common emotion types (more generally, affective states): Satisfaction, Happiness, Excitement, Sadness, Anxiety, and Anger. Note that a text message may convey two or more emotions simultaneously, such as excitement and sadness.
Each of the six emotion outputs (Satisfaction, Happiness, Excitement, Sadness, Anxiety and Anger) has the following values:
|Value||Present||Meaning|
|0||No||Does not express or imply this emotion|
|1 or more||Yes||Expresses or implies this emotion; a higher value indicates greater strength|
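To illustrate how these values might be read, the sketch below lists the emotions present in a result, strongest first. It is written in Python and assumes the six emotion outputs arrive as a dictionary keyed by emotion name; that layout and the function name `detected_emotions` are assumptions made for illustration, so consult the API Documentation for the actual response format.

```python
EMOTIONS = ("Satisfaction", "Happiness", "Excitement",
            "Sadness", "Anxiety", "Anger")


def detected_emotions(output: dict) -> list:
    """Return (emotion, strength) pairs with strength >= 1, strongest first.

    `output` is an assumed dictionary of emotion name -> integer value.
    Emotions missing from the dictionary are treated as 0 (not present).
    """
    pairs = [(name, int(output.get(name, 0))) for name in EMOTIONS]
    present = [(name, value) for name, value in pairs if value >= 1]
    # sorted() is stable, so ties keep the order used in EMOTIONS.
    return sorted(present, key=lambda pair: pair[1], reverse=True)
```

With the hypothetical input `{"Excitement": 2, "Sadness": 1, "Anger": 0}`, the function returns `[("Excitement", 2), ("Sadness", 1)]`, reflecting that one message can carry more than one emotion.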
7) There are also other outputs. How should I use them?
SentiMo also produces three additional outputs. "Negate" indicates whether the text message contains a negation. "Positive" indicates whether the text message expresses or implies a positive meaning (a higher value indicates greater strength). "Negative" indicates whether the text message expresses or implies a negative meaning (a higher value indicates greater strength).
These outputs provide extra information about SentiMo's analysis logic. You may ignore them and focus on the main outputs.
8) How come my tested text/tweets do not work?
There are several possible reasons. First, automated sentiment and emotion judgment is a well-known hard problem, and 100% accuracy is not even a target. Second, the test tweet may be so ambiguous that even humans do not agree with each other on its sentiment. Third, the test tweet may carry a complex or nuanced meaning. Future major upgrades of SentiMo are expected to handle more complex or nuanced utterances.
9) Can your system handle other inputs, such as other languages (e.g., Chinese, Malay) / sarcasm / humor / images / speech / videos?
Multi-modal extension is on our R&D agenda. We will release future major upgrades that address these extensions.
10) What programming languages do you use?
We used Java to implement the system. However, SentiMo is available as a standard RESTful API and as an SDK, so it can be integrated into a variety of platforms.
11) What are the hardware and software requirements for setting up the SentiMo SDK?
We recommend the following system setup. Other similar setups may also work.
|Disk Space||10 GB for SentiMo data storage; 1 GB for SentiMo application. Disk space requirements will vary with the volume of your data and its growth.|
|Memory||SentiMo application: minimum 8 GB, recommended 16 GB or more. MySQL: minimum 4 GB, recommended 8 GB or more.|
|Processor||Minimum: x64 processor, 2.0 GHz, 4 CPU cores. Recommended: 3.2 GHz or faster, 8 CPU cores or more.|
|Operating System||Recommended: CentOS 7 64bit|
|JRE||JRE 1.8 or above|
|MySQL||5.6.28 or above 64bit|
12) I heard about the "Singlish" feature in SentiMo. What is that?
SentiMo incorporates a lexicon of over 400 common words and phrases in Singapore Colloquial English ("Singlish"), e.g., shiok, paiseh, alamak.
To the best of our knowledge, it is the first sentiment analysis tool that has this capability.
13) Can I customize / extend with my own lexicon?
Yes, SentiMo API & SDK 3.0 provides an interface for users to add new words and phrases and to edit the ones they have added. Please refer to Section 4, "Knowledge Management", of the API Documentation for details.
14) How is SentiMo different from social media services / platforms provided by XYZ company?
SentiMo is a social media text data processing tool. As such, its technological and system features, introduced on this site, are distinct from those of other tools.
This differs from social data analytics providers, which offer services such as data collection ("social listening") platforms and consultancy.
SentiMo complements these social data analytics providers. The most common application we have seen so far is using SentiMo to augment these providers' capabilities, improving the speed and accuracy of processing raw data into sentiment and emotion outputs. These providers serve clients in a wide range of sectors, such as retail and consumer businesses, hotels and tourism, and social policy and public administration.