Created at: June 10, 2025 00:02
Company: Accenture
Location: Hartford, CT, 6101
Job Description:
We Are
Accenture is at the forefront of innovation, helping companies harness the power of Generative AI, Large Language Models (LLMs), and agentic applications to transform their businesses. Our flagship platform, AI Refinery™, empowers enterprises to optimize operations, drive innovation, and unlock new opportunities.
With over 1,600 professionals dedicated to generative AI, and drawing on the depth and experience of more than 40,000 AI and data professionals across the company, our Generative AI and LLM Center of Excellence brings together our Experienced Innovation, Strategic Investment, Exceptional Talent, and Power Ecosystem.
You Are
As an experienced and dynamic data scientist, you will develop innovative approaches to extract meaning and value from large and complex datasets by leveraging state-of-the-art LLMs and GenAI techniques. Collaborating across the data science lifecycle, you will design and implement both qualitative and quantitative methods to prepare, enrich, and optimize data for advanced analytical and agentic workflows. Your role will involve scripting and programming to aggregate, embed, and index data from diverse sources, transforming it into actionable intelligence aligned with customer requirements. Additionally, you will evaluate, document, and communicate your research processes, methodologies, and results with peers, stakeholders, and leadership to drive transparency, reproducibility, and innovation.
The Work
As a data scientist for our AI Refinery™ platform at Accenture, you will:
Data Preparation & Management:
Work with structured, semi-structured, and unstructured data across domains such as customer operations, finance, life sciences, and telecommunications.
Build scalable data pipelines and perform advanced preprocessing to prepare datasets for fine-tuning large language models (LLMs) and prompt-based generative AI tasks.
Design semantic layers, taxonomies, and domain-specific ontologies to enable contextual reasoning and knowledge representation for LLM-driven applications.
LLM & Agent Collaboration:
Prepare and structure training/evaluation data for prompt engineering, RAG, and fine-tuning.
Align datasets with downstream use cases such as customer journey analysis, marketing trends, and intelligent task scheduling.
Work closely with cross-functional teams, including research engineers and developers, to pass processed data into vector databases or knowledge graphs.
Cloud & Data Lake Integration:
Access, extract, and preprocess data from cloud data lakes and warehouses (GCP, Azure, AWS).
Ensure secure, scalable, and governed access to enterprise data sources used by LLM-based agents.
Travel may be required for this role. The amount of travel will vary from 0% to 100%, depending on business needs and client requirements.