Data Engineer (Big Data -> Hadoop / Spark / Kinesis -> Strategy & Delivery)
Bristol OR Malmesbury, United Kingdom
Description
Dyson are using Amazon's leading-edge, cloud-based analytics solutions (Hadoop, Elasticsearch, Kinesis and Athena) to develop a world-class IoT Big Data solution. Be part of the team driving real business and consumer insight from petabytes of machine and user data.
Market Overview
In May 2016 Dyson launched the second of its connected products, the Dyson Pure Cool Link, which joins our existing 360 Eye robot vacuum cleaner with its companion eco-system, Dyson Link. Dyson Link is our IoT solution for enabling Dyson products to work in a connected environment. It includes the key components required to create a connected product, from mobile apps and web/CRM integrations to cloud services (provisioning, asset management, message routing, product/app analytics, persistence and scaling).
Function Overview
The Connectivity Architecture Team are responsible for ensuring the next generation of connected products and technologies are properly explored, tested and refined in readiness for transition to the team responsible for delivering our IoT solutions into production. This includes ensuring we investigate and utilise the right technologies, techniques, services and security. Additionally, Dyson products will need to operate in a wider connected environment, requiring interoperability to be explored and evolved.
Strategy / Liaison
- Ensure that current trends, developments and improvements in data acquisition, processing and reporting are understood and investigated.
- Understand the needs for analysis of M2M data within the wider organisation and develop a strategy to achieve success.
- Creation of a platform to allow the realisation of our future Machine Learning initiatives.
- Work with the Legal and Security teams to establish a strategy for securing and managing data appropriately.
- Act as the Architect for M2M data and lead the wider business teams in the creation of aligned requirements.
- Manage a team of Data Engineers, designing, developing and configuring solutions to meet the business needs for M2M data.
- Work with the wider data teams within Dyson to ensure M2M data is aligned with the data from Line of Business systems, to provide holistic reporting.
- Ensure that the technical teams deliver strategically aligned and capable systems for the data lifecycle.
- Work with existing development and research teams within Dyson to set direction and guide development to enable new solutions and approaches.
- Explore opportunities within new data platforms and products to demonstrate possibilities and contribute to the future direction of our connected solutions.
- Ensure that product and user requirements for such solutions are analysed and captured to assist in the definition of product vision specifications.
- Develop and explore the toolsets at hand (AWS) for ingestion, processing and reporting of M2M data.
- Assist in the design of our messaging systems to ensure our analytics needs can be met.
- Assist in the prototyping of connectivity-based data solutions and demonstrate proof-of-concept applications.
- Conduct testing on released Dyson connectivity products in order to close the loop and feed learning back into future connectivity strategy.
- Develop the next generation of AWS EMR based analytics and reporting tools.
- Handover solutions to the business as usual teams, providing backup support as necessary.
Skills & Experience
- Experience in data cleansing, schema reconciliation, and related tools (e.g. Uber Paricon).
- API scalability, load balancing, and security (REST, MQTT, JSON, GraphQL, encryption, key management).
- Strong background in data and analytics systems, primarily the Apache Big Data toolsets (e.g. HBase, Cassandra, Kafka and Spark).
- Experience in architecting and building connectivity data systems and integrations.
- Experience of integrating very large datasets with line of business data.
- Proven track record of delivering strategic solutions for end-users.
- Passion for agile development of very well tested code (unit and integration) in a containerised/cloud environment
- Exposure to CI/CD development scenarios (Jenkins, Jira, Nexus, Git, canary deployment).
- Experience in scalability/failure testing (Simian Army/Chaos Monkey)
- Proven track record of developing robust requirements specifications.
- Flexible and dynamic approach to development.
- The ability to quickly adopt new concepts, languages and techniques and convey the benefits to others.
- Good understanding and experience of agile application development practices.
- Ability to communicate complex ideas simply.
- Proven ability to work in an interdisciplinary team.
- Experience monitoring a data infrastructure at scale.
- Exposure to configuration management tools (AWS CloudFormation, Elastic Beanstalk, Chef, Puppet).
- Some exposure to UI development.
- Experience of M2M data environments.
- Experience of AWS data analytics toolsets, including EMR, Elasticsearch, Kinesis and Redshift.
- Strong communication skills and the ability to build strong relationships with others.
- The ability to empower others and develop the team.
- Able to take accountability for deliverables and focus on achieving them in a timely manner to the highest quality standards.
- Customer focused and keen on exceeding expectations.
- A strong understanding of product development.
- Self-motivated, dynamic and results-driven.
- Professional in tense or challenging situations.
- Clear communication and good interpersonal skills at all levels of contact.
- Ability to work across boundaries and bring together a wide range of disparate elements in a cohesive way to enable a vision.
- Strong planning & time management skills.
- Ability to prioritise and manage workload (both your own and others).
- Sensitive to cultural differences across a global company.
- Ability to be flexible as part of a small team in a growing company.