Building a solution model for a real-life problem has always been an inspiration for me. There are many graph-based solution models running in different domains such as healthcare, finance, fraud detection, COVID tracking, etc.
Today, environmental pollution has become a serious challenge for all of us. Nations are pledging net-zero carbon emissions through COP26 and SDG 13.
In fact, global warming, water pollution, and dramatic changes in water cycles and weather are all connected to each other.
That is where I got the idea to present these connected environmental pollution patterns in a graph-based solution model, and the Neo4j Leonhard Euler Idea Contest provided an amazing platform to present it globally.
Just having an idea is not enough: we must show how it works and whether it is valid enough to implement. So I gathered some sample datasets from the internet, which you can find on my Git.
The graph model is pretty simple. Just by looking at it, you can see how the connected patterns (relationship arrows) link industries to their area's air and water quality measures and disease cases. That helps us find the industries possibly responsible for the pollution in an area, so that immediate action can be taken at the root level with proof of patterns.
Did I say proof of patterns? Why not? If every industry is working as per the norms, then why are we not able to track polluters down easily? What makes it so complex? Why does it take so much time? The answer may be the "connections": we are not able to find connections with proof. Maybe a graph can help us simplify the whole process.
Below is a brief overview of the data points in the solution model. City and Pollutant nodes are linked via the HAS_POLLUTANTS relationship, which carries the air quality measures recorded at fixed intervals of time.
Pollutant nodes represent the pollutants found in a City, and Disease nodes represent the diseases reported in it. Industry nodes use RawMaterials and follow a manufacturing Process, both of which can be linked to the pollutants and hazardous chemicals.
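To make the data points above concrete, here is a minimal sketch of how one tiny sample graph might be created. The labels and property names follow the query used in this post, but every value is made up for illustration, and the USES relationship name and the rawMaterial property are my assumptions, not taken from the actual dataset.

```cypher
// Illustrative sample only: all values are invented placeholders.
// USES and rawMaterial are assumed names, not from the real dataset.
CREATE (c:City {city: 'SampleCity'})
CREATE (p:Pollutants {pollutant: 'SamplePollutant'})
CREATE (d:Disease {disease: 'SampleDisease'})
CREATE (i:Industries {industry: 'SampleIndustry'})
CREATE (m:RawMaterials {rawMaterial: 'SampleRawMaterial'})
// Air quality measure for the city, with the level above the permitted maximum
CREATE (c)-[:HAS_POLLUTANTS {reportedOnYr: 2020, reportedLevel: 95.0, maxPermissibleLevel: 80.0}]->(p)
// Disease cases reported in the same city and year, related to the pollutant
CREATE (d)-[:REPORTED_IN {reportedYear: 2020, patientCount: 120}]->(c)
CREATE (d)-[:RELATED_TO]->(p)
// The pollutant is linked to a raw material that a local industry uses
CREATE (p)-[:LINKED_TO]->(m)
CREATE (i)-[:USES]->(m)
CREATE (i)-[:IS_IN]->(c)
```

With data shaped like this, a pattern that walks from the city through the disease and pollutant to the raw material should reach the sample industry.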
The simple query below helps reveal the industries that may be responsible for the pollution and disease in a city.
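// Cities where a pollutant exceeded its permissible level in the same year a related disease was reported
MATCH (c:City)<-[r:REPORTED_IN]-(d:Disease)-[:RELATED_TO]->(p:Pollutants)<-[r1:HAS_POLLUTANTS]-(c)
WHERE r.reportedYear = r1.reportedOnYr
  AND r1.reportedLevel > r1.maxPermissibleLevel
// Industries in that city connected to the pollutant through a shared intermediate node
MATCH (p)-[:LINKED_TO]->()<--(i:Industries)-[:IS_IN]->(c)
RETURN DISTINCT r.reportedYear AS Year, c.city AS City,
       p.pollutant AS Pollutant, d.disease AS Disease,
       r.patientCount AS DiseaseCaseCount, i.industry AS Industry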
So you can see how easy it is to query the proof. :)
What are the next possible updates to the solution?
The next steps include adding more factors, such as industry domains, as nodes contributing to pollution; linking processes and raw materials; and adding country-wise analysis.
This is all from me about my solution. Suggestions and improvements are most welcome – you can mention them in comments 🙂
You can also reach out to me on LinkedIn to discuss any cool ideas. 😉
Graph: A Possible Solution for Environmental Pollution! was originally published in Neo4j Developer Blog on Medium.