Friday, February 24, 2012

how to find conditions across rows (attendance)

Hello,
I need to find students that have 4 consecutive absences. When a
student is absent 4 times in a row, they can be dropped from the class.

My class attendance file contains each attendance by date and whether
they were present or not. When the student has 4 consecutive value 1
(absent) for a given session and a given class the are considered to be
dropped.
If I needed to know the total number of absences, I know I could group
and summarize, but this one has the consecutive twist.
Table:

CREATE TABLE "dbo"."clsatt"
("FULL_CLASS_ID" CHAR(15) NOT NULL,
"STUDENT_ID" CHAR(20) NULL,
"SESSION_ID" CHAR(10) NULL,
"MEETING" SMALLINT NOT NULL,
"PRESENT" CHAR(2) NOT NULL)
;
Present value of 1 is absent, value of 2 is present (3 means holiday)
Classes typically meet 12 times.
I would want something like
FULL_CLASS_ID, STUDENT_ID, SESSION_ID, 'Dropped'
as the output.

Notice in the example the first student was absent the last 4 meetings
The second student 5 absenses
and the third student was totally absent
In these three examples, they are flagged as dropped.
TIA
Rob
Inserts:

------------------------
insert into clsatt values ('BUS100','1675812194','200203',1,'2')
insert into clsatt values ('BUS100','1675812194','200203',2,'2')
insert into clsatt values ('BUS100','1675812194','200203',3,'2')
insert into clsatt values ('BUS100','1675812194','200203',4,'2')
insert into clsatt values ('BUS100','1675812194','200203',5,'2')
insert into clsatt values ('BUS100','1675812194','200203',6,'2')
insert into clsatt values ('BUS100','1675812194','200203',7,'2')
insert into clsatt values ('BUS100','1675812194','200203',8,'2')
insert into clsatt values ('BUS100','1675812194','200203',9,'1')
insert into clsatt values ('BUS100','1675812194','200203',10,'1')
insert into clsatt values ('BUS100','1675812194','200203',11,'1')
insert into clsatt values ('BUS100','1675812194','200203',12,'1')
insert into clsatt values ('BUS100','1712400537','200203',1,'2')
insert into clsatt values ('BUS100','1712400537','200203',2,'2')
insert into clsatt values ('BUS100','1712400537','200203',3,'2')
insert into clsatt values ('BUS100','1712400537','200203',4,'2')
insert into clsatt values ('BUS100','1712400537','200203',5,'2')
insert into clsatt values ('BUS100','1712400537','200203',6,'2')
insert into clsatt values ('BUS100','1712400537','200203',7,'2')
insert into clsatt values ('BUS100','1712400537','200203',8,'1')
insert into clsatt values ('BUS100','1712400537','200203',9,'1')
insert into clsatt values ('BUS100','1712400537','200203',10,'1')
insert into clsatt values ('BUS100','1712400537','200203',11,'1')
insert into clsatt values ('BUS100','1712400537','200203',12,'1')
insert into clsatt values ('BUS100','1801704805','200203',1,'1')
insert into clsatt values ('BUS100','1801704805','200203',2,'1')
insert into clsatt values ('BUS100','1801704805','200203',3,'1')
insert into clsatt values ('BUS100','1801704805','200203',4,'1')
insert into clsatt values ('BUS100','1801704805','200203',5,'1')
insert into clsatt values ('BUS100','1801704805','200203',6,'1')
insert into clsatt values ('BUS100','1801704805','200203',7,'1')
insert into clsatt values ('BUS100','1801704805','200203',8,'1')
insert into clsatt values ('BUS100','1801704805','200203',9,'1')
insert into clsatt values ('BUS100','1801704805','200203',10,'1')
insert into clsatt values ('BUS100','1801704805','200203',11,'1')
insert into clsatt values ('BUS100','1801704805','200203',12,'1')
insert into clsatt values ('BUS100','1922287588','200203',1,'1')
insert into clsatt values ('BUS100','1922287588','200203',2,'1')
insert into clsatt values ('BUS100','1922287588','200203',3,'2')
insert into clsatt values ('BUS100','1922287588','200203',4,'2')
insert into clsatt values ('BUS100','1922287588','200203',5,'2')
insert into clsatt values ('BUS100','1922287588','200203',6,'2')
insert into clsatt values ('BUS100','1922287588','200203',7,'2')
insert into clsatt values ('BUS100','1922287588','200203',8,'2')
insert into clsatt values ('BUS100','1922287588','200203',9,'2')
insert into clsatt values ('BUS100','1922287588','200203',10,'2')
insert into clsatt values ('BUS100','1922287588','200203',11,'1')
insert into clsatt values ('BUS100','1922287588','200203',12,'2')
insert into clsatt values ('BUS100','2188469657','200203',1,'1')
insert into clsatt values ('BUS100','2188469657','200203',2,'1')
insert into clsatt values ('BUS100','2188469657','200203',3,'2')
insert into clsatt values ('BUS100','2188469657','200203',4,'2')
insert into clsatt values ('BUS100','2188469657','200203',5,'2')
insert into clsatt values ('BUS100','2188469657','200203',6,'2')
insert into clsatt values ('BUS100','2188469657','200203',7,'2')
insert into clsatt values ('BUS100','2188469657','200203',8,'2')
insert into clsatt values ('BUS100','2188469657','200203',9,'1')
insert into clsatt values ('BUS100','2188469657','200203',10,'1')
insert into clsatt values ('BUS100','2188469657','200203',11,'1')
insert into clsatt values ('BUS100','2188469657','200203',12,'2')
insert into clsatt values ('BUS100','2515197431','200203',1,'1')
insert into clsatt values ('BUS100','2515197431','200203',2,'1')
insert into clsatt values ('BUS100','2515197431','200203',3,'2')
insert into clsatt values ('BUS100','2515197431','200203',4,'2')
insert into clsatt values ('BUS100','2515197431','200203',5,'1')
insert into clsatt values ('BUS100','2515197431','200203',6,'2')
insert into clsatt values ('BUS100','2515197431','200203',7,'2')
insert into clsatt values ('BUS100','2515197431','200203',8,'1')
insert into clsatt values ('BUS100','2515197431','200203',9,'2')
insert into clsatt values ('BUS100','2515197431','200203',10,'2')
insert into clsatt values ('BUS100','2515197431','200203',11,'1')
insert into clsatt values ('BUS100','2515197431','200203',12,'2')
select FULL_CLASS_ID,STUDENT_ID,SESSION_ID,'Dropped'
from (
select t1.FULL_CLASS_ID,
t1.STUDENT_ID,
t1.SESSION_ID,
t1.MEETING -
(select count(*) from clsatt t2
where t2.FULL_CLASS_ID=t1.FULL_CLASS_ID
and t2.STUDENT_ID=t1.STUDENT_ID
and t2.SESSION_ID=t1.SESSION_ID
and t2.MEETING<=t1.MEETING
and t2.PRESENT='1') as Rn
from clsatt t1
where t1.PRESENT='1') X
group by FULL_CLASS_ID,STUDENT_ID,SESSION_ID,Rn
having count(*)>= 4|||Mark,
Thanks much!
Rob|||I need to find students that have 4 consecutive absences. When a
student is absent 4 times in a row, they can be dropped from the class.

My class attendance file contains each attendance by date and whether
they were present or not. When the student has 4 consecutive value 1
(absent) for a given session and a given class the are considered to be
dropped.
If I needed to know the total number of absences, I know I could group
and summarize, but this one has the consecutive twist.
Table:

The tabel made no sense. NULL student ids of 20 characters in length?
Numeric attendance codes kept in CHAR(), you have no key? etc.

CREATE TABLE ClassAttendance
(class_name CHAR(15) NOT NULL,
student_id CHAR(20) NOT NULL,
session_id CHAR(10) NOT NULL,
meeting_nbr INTEGER NOT NULL,
attend_code INTEGER DEFAULT 1 NOT NULL
CHECK (attend_code IN (1,2,3)),
PRIMARY KEY (class_name, student_id, session_id, meeting_nbr));

Here is a shot using the new OLAP functions:

SELECT DISTINCT student_id, class_name, session_id
FROM (SELECT student_id, class_name, session_id,
SUM(attend_code)
OVER(PARTITION BY class_name, session_id
ORDER BY student_id, class_name, session_id
ROWS 4 PRECEDING)
FROM ClassAttendance
GROUP BY student_id, class_name, session_id)
AS X (student_id, class_name, session_id, last_four)
WHERE last_four = 4;|||thanks for your input Celko.

The table is only representative as to help me with the SQL, it is not
my student dimension as it is stored in my data warehouse.
The attendance code (PRESENT in my example) is an ID or attribute of
the attendance and not a fact. Therefore it is intended to be a char
type (or varchar). Summing on this field wont make sense from the
business. The only non-char type fields I keep are facts that can be
aggregated (tuition charges, payments and so forth).
Since this is a data warehouse intended to incorporate data from our
current operational systems as well as future currently unknown
systems, I need the flexibility that char provides. Another system
might have alpha-numeric student ID's.
I use DataManager from Cognos to build my warehouse, which has PK
information, so I have not defined in the database PKs (yet).
When I post for help, I don't expect to get exact SQL for my exact
problem; therefore I post what is representative of my problem. Also,
the solutions and ideas people post sometimes don't do what I need,
but provide me valuable knowledge to help solve future problems.
However in this case markc nailed it and I was able to solve my problem
and learn something new!
As for learning something new, I will try your example and see what it
does. Thanks for your input!
Thanks
Rob|||SQL Server 2005 doesn't support SUM()..OVER(ORDER BY..ROWS 4 PRECEDING)
However, you can do this

select FULL_CLASS_ID,STUDENT_ID,SESSION_ID,'Dropped'
from (
select FULL_CLASS_ID,
STUDENT_ID,
SESSION_ID,
MEETING - RANK() OVER(PARTITION BY
FULL_CLASS_ID,STUDENT_ID,SESSION_ID
ORDER BY MEETING) as Rn
from clsatt
where PRESENT='1') X
group by FULL_CLASS_ID,STUDENT_ID,SESSION_ID,Rn
having count(*)>= 4

Regards

Mark|||Anohter answer: Since there are only a few ranges, we can build an
auxiliary table and use it:

CREATE TABLE FourRanges
(start_session_nbr INTEGER NOT NULL,
end_session_nbr INTEGER NOT NULL,
CHECK (start_session_nbr < end_session_nbr));

INSERT INTO FourRanges VALUES (1, 4);
INSERT INTO FourRanges VALUES (2, 5);
INSERT INTO FourRanges VALUES (3, 6);
INSERT INTO FourRanges VALUES (4, 7);
..
INSERT INTO FourRanges VALUES (9, 12);

SELECT A1.course_name, A1.session_id, A1.student_id,
SUM(attend_code)
FROM ClassAttendance AS A1, FourRanges AS F
WHERE A1.session_nbr BETWEEN F.start_session_nbr
AND F.end_session_nbr
GROUP BY course_name, session_id, student_id
HAVING SUM(attend_code) = 4;

No comments:

Post a Comment