Our Client is the largest mutual life insurance company in the United States. Founded in 1845, it is headquartered in New York City, maintains offices in all fifty states, and owns Seguros Monterrey in Mexico.
You will be part of the Data & Platform sub-function team under the Center for Data Science and Analytics. The Data & Platform team serves internal Data Scientists who focus on statistical analysis.
You will join a fast-paced, high-impact team that works with an entrepreneurial mindset, using some of the best-of-breed tools in our Enterprise Data Lake (Hadoop), including R, Spark and Python.
You will apply your data engineering skills to build pipelines and workflows that gather, cleanse, test and curate data from Oracle, MS SQL Server and third-party sources, and to create datasets in the Enterprise Data Lake (Hadoop) for use by several teams of predictive modelers.
You will run proofs of concept and test new software tools that fall under the umbrella of data science but are geared more towards data engineering.
- Ingests, merges, prepares, tests and documents curated datasets from various novel external and internal data sources for a variety of advanced analytics projects, such as multivariate models for Risk, Marketing and Compliance
- Utilizes data wrangling, data matching and ETL techniques to explore a variety of data sources, gain data expertise, perform summary analyses and curate datasets
- Functions as a data expert and contributes to analytics/solutions design and productizing decisions
- Works independently with some supervision and as part of a collaborative team
- Works with Project Managers and Scrum Masters to provide milestones and stories
- Proactively and effectively communicates in various verbal and written formats with senior-level team members and partners
- Graduate-level degree in computer science or engineering, or relevant experience in the field of Business Intelligence, Data Mining, Database Engineering or Programming
- 3-5 years of overall experience in data wrangling and programming, including a minimum of 1 year of experience ingesting, cleaning and merging data and applying the necessary data wrangling logic in Hadoop
- 1+ years of writing complex SQL queries in any of the following or similar databases: Oracle, SQL Server, DB2, MySQL
- Proficiency in Python for all data-related work, including libraries such as NumPy, pandas and PySpark
- Experience working with the Linux operating system
- Experience working with data visualization tools or packages
- Experience building Exploratory Data Analysis reports with histograms, box plots, Pareto charts and scatter plots, using R, Python or a data visualization tool such as Tableau or Spotfire
- Understanding of statistical modeling concepts, designs and analytics-based products
- Any experience using ETL tools such as Ab Initio, Talend, Informatica or Pentaho
- Any experience working with Data Warehouses and/or Data Marts
- Any experience in the life insurance business