Spondon Majumdar

CEO

Bengaluru, Karnataka, India · 12 yrs 10 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Ten years of experience in big data and NLP.
Expert in automating data generation and extraction.
Proven track record in startup environments.

Stackforce AI infers this person is a Big Data and NLP specialist with a strong focus on automation in SaaS environments.

Skills

Core Skills

Big Data, Natural Language Processing, Software Engineering

Other Skills

AngularJS, Ansible, Automated data generation, Automated product discovery, Bloomfilter, C, C++, Core Java, Data optimization, Eclipse, Elastic Search, Gate, Git, Graph Databases, Graphite

About

Engineer by education, traveler by passion, and software engineer by profession, I have ten years of work experience overall. Words like startups, algorithms, beautiful code, and architecture best describe me professionally. I love to deal with data, lots of data. As a big data professional, my primary responsibilities are to represent data, beautiful data, to match data patterns, to clean data, and of course to generate data automatically. I have worked with three top-notch startups in the past and am now looking to solve complex computer science problems with the help of cutting-edge technologies.

Experience

Probe Equity Research

confidential

Nov 2015 – Present · 10 yrs 4 mos · Bengaluru Area, India

At Probe I work on:
Automated large-scale data generation
As an R&D team member, my primary responsibility was to move manual processes toward automation to reduce cost as well as improve efficiency and accuracy. To that end, we introduced a back-end application that generates and collects data from the web and is directly linked to the Probe data-gathering app.
PDF parsing (using an open-source tool with further improvements)
Until now, Probe has depended on third-party solutions that identify PDFs; you then have to manually tell them which part you want to extract, extract it semi-manually, and put it into the Probe data app.
I built an application using an open-source tool and NLP that automatically extracts PDF table information in the desired format; this has rapidly reduced unnecessary third-party solution costs.
My solution was robust and dependable, with accuracy above 75%.
HTML parsing
Parsing AJAX-based pages at large scale and extracting the relevant information to give insight into the data.
Named Entity Recognition
Using the Stanford NLP parser to process PDFs and extract meaningful financial information.
Text Extraction
Large-scale extraction from text files using regex and other NLP techniques.
Module Linking
Data optimization
Pipeline Improvement

Automated data generation, PDF parsing, HTML parsing, Named Entity Recognition, Text Extraction, Data optimization, +3

PromptCloud

Senior Software Engineer

Aug 2013 – Oct 2015 · 2 yrs 2 mos · Bengaluru Area, India

At PromptCloud I have been involved in many kinds of work, from client interaction to monitoring our cloud cluster and streamlining our tech stack. We mainly work on large-scale data generation from various web sources, hosted indexing, and analysis on top of the data. Some of my contributions that helped PromptCloud enrich its tech pipeline are:
Built a framework for automated product (URL) discovery
Riak: Added Riak to our tech pipeline to prevent duplication. Its main advantage is that it is distributed, and we use it from data link generation through to the data cleanup process.
Resque: Integrated Resque with our tech stack to create background jobs, place them on multiple queues, and process them later.
Bloomfilter: A small subset of Riak; basically an in-memory hash that we use as key/value pairs for our internal deduplication process.
Selenium: A browser-like instance that we use for auto-filling or executing human-like behavior.
RabbitMQ: Improved some of our queueing mechanisms.
Introducing new components in the pipeline: Created a Ruby gem to generate client-specific plugins as per requirements; this has reduced human interaction with our core code base. Introduced automated proxy generation from various sources to prevent blocking on complex sites. Integrated our requirement-gathering dashboard [http://app.promptcloud.com/] into our tech pipeline.
Functionality Improvement & Automation: Automated our data extraction module, improved our page-fetch mechanism, improved pipeline report generation, and introduced the use of config files.
Monitoring Tools: Graphite, Logstash
Testing Tools: Minitest, RubyCritic, SimpleCov
Project Management Tool: Redmine
Elastic Search, Redis, Kibana, and Sinatra are also integral parts of our tech pipeline, and we use them frequently.

Automated product discovery, Riak, Resque, Bloomfilter, Selenium, RabbitMQ, +8

Sigmoid Analytics

Trainee Natural Language Processing Developer

Feb 2013 – Jun 2013 · 4 mos · Bangalore

To be a part of a challenging team that strives for the growth of the organization, explores my potential, and gives me the opportunity to enhance my talent, with the intention of being an asset to the company.
Experience Summary
A Java and natural language processing professional with 4 months of overall experience.
Worked on projects involving structured and unstructured data, named entity recognition, pattern matching, part-of-speech recognition, etc.

Java, Natural Language Processing