2024.12.05 7일차 수업

문제 1) 20년 이상 근무한 사원들의 employee_id, salary, grade_level, 근무 연수, department_name, city를 출력하시오.

1. Oracle 방법

select e.employee_id, e.salary, trunc(months_between(sysdate, hire_date)/12) 근무연수, j.grade_level, d.department_name, l.city
from hr.employees e, hr.job_grades j, hr.departments d, hr.locations l
where months_between(sysdate, e.hire_date)/12 >= 20
and e.salary between j.lowest_sal and j.highest_sal
and e.department_id = d.department_id
and d.location_id = l.location_id;

2. ANSI 표준

select e.employee_id, e.salary, trunc(months_between(sysdate, hire_date)/12) 근무연수, j.grade_level, d.department_name, l.city
from hr.employees e join hr.job_grades j
on e.salary between j.lowest_sal and j.highest_sal
join hr.departments d
on e.department_id = d.department_id
join hr.locations l
on d.location_id = l.location_id
where months_between(sysdate, e.hire_date)/12 >= 20;

문제 2) 관리자보다 먼저 입사한 사원의 last_name, hire_date 및 해당 관리자의 last_name, hire_date 출력해 주세요.

1. Oracle 방법

select w.employee_id, w.last_name, w.hire_date, m.employee_id, m.last_name, m.hire_date
from hr.employees w, hr.employees m
where w.manager_id = m.employee_id
and w.hire_date < m.hire_date;

2. ANSI 표준

select w.employee_id, w.last_name, w.hire_date, m.employee_id, m.last_name, m.hire_date
from hr.employees w inner join hr.employees m
on w.manager_id = m.employee_id
and w.hire_date < m.hire_date;

ex) 110번 사원의 급여보다 더 많이 받는 사원 정보 추출

- 서브쿼리 맛보기

# 만약 서브쿼리를 작성 안 할 경우

select *
from hr.employees
where employee_id = 110;

select salary
from hr.employees
where salary > 8200;

▶ 되게 번거롭게 작업을 해야 한다.

# 서브쿼리 이용

select *
from hr.employees
where salary > (select salary
                 from hr.employees
                where employee_id = 110);

▶ 한 번에 수행할 수 있다.

■ subquery (서브쿼리)

- SQL문 안에 SELECT문을 서브쿼리라고 한다.

- SELECT문 안에 서브쿼리는 괄호()로 묶어야 한다.

■ 중첩 서브쿼리 (nested subquery)

- INNER QUERY (SUB QUERY) 먼저 수행

- INNER QUERY에서 수행한 값을 가지고 MAIN QUERY (OUTER QUERY) 수행한다.\

ex 1)

----------------------------- 이 부분 : MAIN QUERY, OUTER QUERY

select *
from hr.employees
where salary > (select salary
                       from hr.employees
                       where employee_id = 110);

---------------------------------------------- 이 부분 : INNER QUERY, SUB QUERY

ex 2)

select *
from hr.employees
where salary > (select salary
                       from hr.employees
                       where last_name = 'King');

▶ King이 들어가는 사원은 2명 이상이기 때문에 단일행 비교연산자인 >랑 비교할 수 없기 때문에 오류가 발생한다.

■ 단일행 서브쿼리

- 서브쿼리의 결과가 단일값이 나오는 서브쿼리

- 단일행 비교연산자 : =, >, >=, <, <=, <>, !=, ^=

ex 1) 직원ID 110인 직원과 동일한 직무를 가지고 있지만, 직원ID 110인 직원보다 더 높은 급여를 받는 모든 직원

select *
from hr.employees
where job_id = (select job_id
                        from hr.employees
                         where employee_id = 110)
and salary > (select salary
                   from hr.employees
                    where employee_id = 110);

ex 2) 사원 중에서 급여를 제일 많이 받는 사람

select *
from hr.employees
where salary = (select max(salary)
                from hr.employees);

▶ 서브쿼리 : hr.employees 테이블 안에 가장 높은 급여를 선택한다.

▶ 외부쿼리 : 각 직원의 급여가 서브쿼리에서 반환된 최대 급여와 같은지 확인한다.

▶ 예를 들어 가장 많이 받는 급여가 15,000이면 외부쿼리에 where salary = 15000이 되고 그 조건에 해당하는 사원의 정보를 추출하는 것이다.

ex 3) 오류 발생

select department_id, max(avg(salary))
from hr.employees
group by department_id; -- 오류발생

▶ 오류 발생 원인 : 집계 함수 내부에 다른 집계 함수를 포함할 수 없기 때문이다.

▶ avg 함수와 max 함수를 함께 사용하여 부서별로 평균 급여를 계산하고 그중 최대 평균 급여를 찾으려고 시도하지만

▶ SQL에서 집계함수는 각기 독립적으로 사용되어야 되기 때문에 서브쿼리를 사용해야 한다.

# 해결방법

select department_id, avg(salary)
from hr.employees
group by department_id
having avg(salary) = (select max(avg(salary))
                      from hr.employees
                      group by department_id);

■ 여러 행 서브쿼리

- 서브쿼리의 결과가 여러 개의 값이 나오는 서브쿼리

- 여러 행 비교연산자 : in, any, all

ex 1) 오류발생

select *
from hr.employees
where salary = (select max(salary)
                      from hr.employees
                      group by department_id);

▶ 오류 발생 원인 : 서브쿼리 내에 department_id는 여러 데이터가 들어가 있기 때문에 단일행 비교연산자인 = 을 쓰면 에러가 발생한다.

-> 해결방법

# in (=or)

select *
from hr.employees
where salary in(select max(salary)
                      from hr.employees
                      group by department_id);

▶ 단일행 비교연산자를 지우고 in 연산자를 통해 출력이 가능하다.

select *
from hr.employees
where salary > (select salary
                      from hr.employees
                      where job_id = 'IT_PROG');

▶ 오류 발생 원인 : 서브쿼리 내에 job_id가 IT_PROG인 데이터가 많은데 단일행 비교연산자(>)로 비교했기 때문에 에러가 발생

-> 해결방법

# any (or) : 서브쿼리 중에서 가장 최솟값을 찾아서 출력

- < any : 최댓값보다 작은 의미

- > any : 최솟값보다 큰 의미

- = any : in 연산자 의미

- 어느 하나라도, 적어도 1개, 어떤 것이든

select *
from hr.employees
where salary > any(select salary
                      from hr.employees
                      where job_id = 'IT_PROG');

▼ OR 범주이기 때문에 내부적으로 수행되는 것

select *
from hr.employees
where salary > 9000
or salary > 6000
or salary > 4800
or salary > 4800
or salary > 4200;

▶ any는 or이랑 같다 보면 되는데 어차피 4200보다 큰 값만 출력하면 되기 때문에 any를 쓴다.

# 위에 서브쿼리만 블록 잡아서 실행하면

select salary
from hr.employees
where job_id = 'IT_PROG'

▶ 서브쿼리 4200보다 큰 금액을 받는 사원들의 정보를 출력하는 쿼리이다.

# 만약 any를 안 쓸 경우 (min 붙인다.)

select *
from hr.employees
where salary > (select min(salary)
                      from hr.employees
                      where job_id = 'IT_PROG');

# all (and)

- 모든 조건 충족

- < all : 최솟값보다 작은 의미

- > all : 최댓값보다 큰 의미

- = all : 의미가 없고 쓰면 안 된다. (null 나옴)

ex)

select *
from hr.employees
where salary = all(select salary
                      from hr.employees
                      where job_id = 'IT_PROG');

 select *
from hr.employees
where salary = 9000
and salary = 6000
and salary = 4800
and salary = 4800
and salary = 4200;

ex 1)

select *
from hr.employees
where salary > all(select salary
                      from hr.employees
                      where job_id = 'IT_PROG');

▼ AND 범주이기 때문에 내부적으로 수행되는 것

select *
from hr.employees
where salary > 9000
and salary > 6000
and salary > 4800
and salary > 4800
and salary > 4200;

▶ 9000보다 큰 급여를 뽑아주면 된다.

# 만약 all을 안 쓸 경우 (현장에서 SQL개발자가 많이 쓴다.)

select *
from hr.employees
where salary > (select max(salary)
                      from hr.employees
                      where job_id = 'IT_PROG');

문제 1) location_id가 1700인 모든 사원들의 last_name, department_id, job_id를 출력하시오.

1. JOIN 이용

select e.last_name, e.department_id, e.job_id
from hr.employees e, hr.departments d
where e.department_id = d.department_id
and d.location_id = 1700;

2. 서브쿼리 이용

select last_name, department_id, job_id
from hr.employees 
where department_id in (select department_id
                       from hr.departments
                       where location_id = 1700)

문제 2) 관리자 사원들의 정보를 출력해 주세요.

ex 1) 오답 유형 1 - JOIN으로 풀 경우

select *
from hr.employees w join hr.employees m
on w.manager_id = m.employee_id;

▶ JOIN으로 풀게 되면 양쪽(m쪽 집합 X 1쪽 집합 = m쪽 집합) 사원수를 곱해서 나오기 때문에 값이 많이 나온다. 그래서 서브쿼리로 풀어줘야 한다.

ex 2) 오답 유형 2 - 위에 결과를 보면 중복된 manager_id도 출력이 된 걸 볼 수 있다. 이를 해결하려고 distinct or unique를 써서 결과가 제대로 나오는 경우도 있다.

select distinct m.*
from hr.employees w join hr.employees m
on w.manager_id = m.employee_id;

▶ 출력은 정상적으로 나왔다. 대신 악성 프로그램의 원인이 될 수 있기 때문에 쓰지 않는 게 맞다.

# 정답

select *
from hr.employees
where employee_id in (select manager_id
                        from hr.employees);

▶ 서브쿼리에 있는 hr.employees 테이블 안에 manager_id가 employee_id에 하나하나 대입하는 것이다.

문제 3) 관리자가 아닌 사원들의 정보를 출력하시오.

select *
from hr.employees
where employee_id not in (select manager_id
                        from hr.employees);

▶ 에러 뜨지 않고 비어 있다. 하지만 왜 안 나올까 궁금해해야 한다.

# 해결방법

select *
from hr.employees
where employee_id not in (select manager_id
                                     from hr.employees
                                     where manager_id is not null);

▶ not in 연산자를 사용할 때 서브쿼리에 null이 있으면 결과 집합이 null로 발생한다. 이유는 AND 진리표에 true and null(결측치, true/false가 들어올 수도 있다고 생각) = null이기 때문이다.

■ OR 진리표

	TRUE	FALSE	NULL
TRUE	TRUE	TRUE	TRUE
FALSE	TRUE	FALSE	NULL

■ AND 진리표

	TRUE	FALSE	NULL
TRUE	TRUE	FALSE	NULL
FALSE	FALSE	FALSE	FALSE

■ correlate subquery (상호관련 서브쿼리), 현장에서 많이 쓰기 때문에 이해해야 한다.

MAIN QUERY (OUTER QUERY) 먼저 수행
[첫 번째 행을 후보행으로 잡고] 후보행(미지수값)을 서브쿼리에 전달
후보행값을 사용해서 서브쿼리를 수행한다.
서브쿼리 결과값을 사용해서 후보행과 비교한 후 참이면 그 행을 결과집합에 저장(메모리)
다음행을 후보행으로 잡고 2번에 있는 후보행값을 서브쿼리에 3,4번 반복 수행한다.
문제점 : 후보행값이 들어올 때마다 평균을 계속 구해야 하기 때문에 악성 프로그램이 될 수 있다.

문제 ) 자신의 부서 평균 급여보다 더 많이 받는 사원들의 정보를 출력하세요.

select *
from hr.employees e
where salary > (select avg(salary)
                 from hr.employees
                 where department_id = e.department_id); 
                                    --------------------
                                      후보행값(미지수)

▶ 저렇게 실행하면 문제점 : 내부 서브쿼리가 외부쿼리의 각 행에 대해 실행한다. 이는 각 직원에 대해 해당 부서의 평균 급여를 계산하는 것이기 때문에, 데이터 양이 많아질수록 성능이 매우 저하될 수 있다.

▶ 해결방법 : 부서별 평균 급여 집합이라는 가상의 테이블 만들기

■ inline view

- 가상 테이블

- from 괄호 안에 select문(서브쿼리)

- 인라인 뷰를 이용해서 함수를 쓸 경우에는 별칭을 써줘야 오류가 안 발생한다.

- 인라인 뷰를 쓰는 이유 : 서브쿼리와 테이블 조인이 이루어질 경우 검색하는 데이터명(필드명)은 [테이블.필드명]과 같이 길어진다. 이때 서브쿼리를 통해 메인쿼리로 올라갈수록 쿼리의 길이가 점점 더 길어질 뿐 아니라 쿼리의 가독성이 떨어진다. 이때 FROM절에서 사용하는 인라인 뷰에 별칭을 줘서 간단하게 만들 수 있다.

ex) 인라인 뷰 구조

select e1.*, e2.*
from(select department_id, avg(salary) avg_sal
      from hr.employees
      group by department_id) e1, hr.employees e2
      where e1.department_id = e2.department_id
      and e2.salary > e1.avg_sal;

▶ 실제 DB테이블 안에는 부서별 평균 급여 집합이 없으니 가상 테이블을 만들어줘야 한다.

▶ 만들기 위해서 from () 안에 부서별 평균 급여 집합이 들어있는 가상 테이블을 만든다. (= 인라인 뷰)

▶ 그 가상 테이블에 e1이라는 별칭을 달아준 거다.

▶ avg_sal이라고 별칭을 주는 이유는 만약 별칭을 안 주게 되면 avg(salary)가 그대로 where 절에 입력해야 하는데 where절에 그룹함수가 들어가면 오류가 뜨기 때문에 별칭을 꼭 달아줘야 한다.

▶ 가상 테이블이기 때문에 실제 DB안에는 없는 테이블이다.

'oracle SQL' 카테고리의 다른 글

2024.12.09 9일차 수업 (4)	2024.12.09
2024.12.06 8일차 수업 (0)	2024.12.06
2024.12.04 6일차 수업 (1)	2024.12.04
2024.12.03 5일차 수업 (2)	2024.12.03
2024.12.02 4일차 수업 (4)	2024.12.02

영산 개발 공부

2024.12.05 7일차 수업

'oracle SQL' 카테고리의 다른 글

티스토리툴바

2024.12.05 7일차 수업

'oracle SQL' 카테고리의 다른 글

'oracle SQL' Related Articles

티스토리툴바