Hello, I have some questions about PundiAI as a decentralized training platform.
-
How do we prevent spambots from intentionally feeding in random and/or wrong data, a.k.a. data poisoning?
-
Will there be any measures in place to ensure quality control over the data?
- Centralized AI training has standardization and proper guidance from professors who input the data, verify it, validate it, and even identify correct versus incorrect data, until eventually the AI can self-identify errors. (People who have a deep understanding and are within their domain of expertise.)
For instance, if too many data inconsistencies contributed by spambots are fed in, the AI model might reinforce its own feedback loop and amplify that specific incorrect information or bias, leading to further degradation in quality.
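To illustrate my worry, here is a toy simulation I sketched (purely hypothetical, nothing to do with PundiAI's actual pipeline; GROUND_TRUTH, POISON_VALUE, and POISON_RATE are made-up parameters) of how re-ingesting the model's own output alongside poisoned submissions can pull its estimate away from the truth, round after round:

```python
# Hypothetical toy simulation (NOT PundiAI's actual pipeline): a model's
# estimate of some ground-truth value drifts when poisoned submissions are
# mixed in and the model's own prior output is re-ingested each round.

import random

GROUND_TRUTH = 1.0   # the "correct" value honest contributors report
POISON_VALUE = 5.0   # the biased value spambots keep submitting
POISON_RATE = 0.3    # fraction of submissions that are poisoned

def training_round(current_estimate: float, n_submissions: int = 100) -> float:
    """One round of ingestion: honest data + poisoned data + the model's
    own previous output, so earlier poisoning compounds across rounds."""
    batch = []
    for _ in range(n_submissions):
        if random.random() < POISON_RATE:
            batch.append(POISON_VALUE)                         # spambot input
        else:
            batch.append(GROUND_TRUTH + random.gauss(0, 0.1))  # honest input
    # Feedback loop: the model's previous output is treated as data too.
    batch.extend([current_estimate] * (n_submissions // 2))
    return sum(batch) / len(batch)

estimate = GROUND_TRUTH
for round_no in range(1, 11):
    estimate = training_round(estimate)
    print(f"round {round_no:2d}: estimate = {estimate:.3f}")
# With 30% poisoned submissions the estimate drifts from 1.0 and settles
# around ~2.2 instead of the truth; it never recovers on its own.
```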
- Is there a baseline for what is considered “correct” or “wrong”?
From what I understand, centralized training allows full control over the input data, ensuring consistency and alignment in quality and direction, whereas decentralized training can be vulnerable to intentional data poisoning and manipulation by spambots, whether from competitors or anyone else.
Because I was wondering: it’s quite hard to control the quality of decentralized data, right?
- Like, who will be the “one” to correct the wrong data?
Wouldn’t that require centralized intervention, which ultimately defeats the purpose of it being decentralized?
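Or could something like stake-weighted validator voting replace a central corrector? This is just my guess at a mechanism, not anything PundiAI has announced; the Validator class, review_submission function, and slash_rate are all names I made up for illustration:

```python
# Hypothetical sketch of one common decentralized answer: instead of a
# central authority, independent validators vote on each data submission,
# weighted by staked tokens they lose if they vote against consensus.
# None of these names come from PundiAI; this is purely illustrative.

from dataclasses import dataclass

@dataclass
class Validator:
    name: str
    stake: float  # tokens at risk; slashed for voting against consensus

def review_submission(votes: dict[str, bool], validators: list[Validator],
                      slash_rate: float = 0.1) -> bool:
    """Accept a submission if stake-weighted approval exceeds 50%,
    then slash validators who voted against the final outcome."""
    by_name = {v.name: v for v in validators}
    approve = sum(by_name[n].stake for n, ok in votes.items() if ok)
    total = sum(by_name[n].stake for n in votes)
    accepted = approve > total / 2
    for name, ok in votes.items():
        if ok != accepted:  # voted against consensus: lose part of stake
            by_name[name].stake *= (1 - slash_rate)
    return accepted

validators = [Validator("a", 100.0), Validator("b", 80.0), Validator("c", 20.0)]
# "c" is a spambot approving its own junk data; the honest majority outvotes
# it, the submission is rejected, and "c" loses 10% of its stake.
print(review_submission({"a": False, "b": False, "c": True}, validators))  # False
print([(v.name, v.stake) for v in validators])  # c slashed to 18.0
```

The idea, as I understand it, is that correction becomes economic rather than centralized: being wrong costs money, so no single “one” has to police the data. But I'd love to hear whether PundiAI plans anything along these lines.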
Thank you! I hope to grasp this better despite my limited understanding!