Job Detail

Specialty Development Senior- SRE, GCP, APM & Java (32234)

  • Job Category: Professional Hiring
  • Job Location: India
  • CTC: 16-28 LPA
  • Job Type: Full Time
  • Qualification: BE

Position Description

Site Reliability Engineer (SRE) with 5-8 years of experience working with cloud technologies
(preferably GCP).


Skills Required

The responsibilities of an SRE managing a large, distributed application built on microservices, Spring Boot, and Google Cloud may include:
• Strong background in software development and systems administration, as well as excellent problem-solving and communication skills.
• Run the production environment by monitoring availability and taking a holistic view of system health.
• Develop, improve, and operate the deployment and orchestration of a complex distributed system.
• Improve the reliability, quality, and time-to-market of our suite of software solutions.
• Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
• Provide primary operational and engineering support for multiple large, distributed software applications.
• Identify and reduce or eliminate toil via automation to maximize the time spent on engineering and innovation.
• Collaborate with development teams to design, build, and operate scalable and resilient software systems.
• Automate deployment, monitoring, and incident response processes.
• Perform root cause analysis of production incidents and implement preventive measures.
• Conduct performance analysis and optimization of the system.
• Ensure compliance with security and regulatory standards.
• Implement and maintain disaster recovery processes.
• Provide technical guidance and mentorship to other team members.
• Participate in an on-call rotation for incident response and support.
Skills Preferred:
• 5-6 years’ experience with Java, J2EE, NoSQL/SQL datastores, Spring Boot, GCP/AWS/Azure, and Docker/K8s in the maintenance and development of multi-tier applications.
• Understanding of RESTful APIs and microservices platforms.
• 4-5 years of experience with APM and other monitoring tools such as Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, DataDog, and PagerDuty.
• Strong experience working with product and development teams to establish error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of the budget to ensure maximum domain availability/uptime.
• Regularly review key site technical metrics such as transaction errors, logging, response times, caching strategies, conversion/bounce rates, and capacity and resource utilization.
• Proactively identify stability risks and work with engineering leadership to establish appropriate mitigation plans.
• Experience solving complex architecture/design and business problems; work to simplify, optimize, and remove bottlenecks.
• Architect, design, and develop automation to reduce toil and improve the recoverability, availability, latency, and scalability of supported applications, with an understanding of MTTD (Mean Time to Detection) and MTTR (Mean Time to Resolution).
• Maintain a knowledge repository that includes standard operating procedures, release checklists, and runbooks for incident recovery.


Experience Required
5-8 years of experience working with cloud technologies
(GCP preferred)

Education Required

Four-year college degree in Computer Science or equivalent.
Education Preferred:
Certifications in Google Cloud (GCP)

Work Type: Work from Office / Hybrid (Chennai Location)

Notice Period: Immediate / 15 days only

EMPLOYER: FASTSWITCH (US SOFTWARE COMPANY)

CLIENT: FORD CHENNAI

Dear Candidate,

If you are interested in applying for this job, kindly send your latest resume, current CTC, expected CTC, notice period, and LinkedIn profile to haridassharrissces@gmail.com or haridass@harrissces.com