Monday, March 26, 2012

How to flag a row and allow max one row flagged in the table.

Hi,

I am new to SQL.

which is the best way to flag a row?

Create a bit column IsFlagged?

Additionally, I want to create a table-level constraint that allows max 1 row flagged true in the whole table. I can't nest a select expression in the CREATE TABLE statement CONSTRAINT clause that counts flagged rows.

So how do I do this?

Appreciate the help.

you can do something like this -

create table FlaggableDomain(ID int unique not null, Name sysname primary key not null)

populated something like this

ID DomainName
1 'person'
2 'place'
3 'thing'

track your flags with this -

create table FlaggedDomain (ItemID int not null, DomainID int not null, primary key (ItemID, DomainID))

to flag a row Person row -


insert into FlaggedDomain (ItemID, DomainId)
select person.id, domain.ID from
person
where name = 'SOME NAME' and
FlaggableDomain.DomainName = 'person'

to clear -

delete from FlaggedDomain where DomainId in
( select idFlaggableDomain
from FlaggableDomain
where DomainName = 'person')

to get flagged person -

select person.*
from person p inner join FlaggedDomain f
on p.id = f.ItemID

|||

Thanks for taking the time Blair,

Let me summarize what I made out of your reply:

1.Create a table of flags(FlaggableDomain)
2. Make a linking table(FlaggedDomain) linking flags with any other table(person), whose rows I want to flag.
3. By having a composite primary key for the linking table(FlaggableDomain) you ensure the person table can only have one row that has the flag 'person'.

This definitely is a solution, and probably the most professional one. However, I now face having to do 2 extra tables whenever I want to implement flagging logic, or have a central depository of many dissimilar flags and always having to join to this depository to be able to use the flags. In the former a table bloat, in the latter clunky code. Not quite like an enumeration in vb/c# which groups constants,hopefully, logically, depending on the author :).

There is one nice feature about your way, that is in a table that can have mixed rows of things/persons/places, more than one type of flag can be stored in the same column. Thus I can make sure max 1 'thing', max 1 'place', and max 1 'person' rows exist in that mixed table. I don't have that need now but it is a neat way.

A not so nice feature: not just any table's rows can be flagged this way. Only tables which have a simple(1 column), int primary key, which is the most common one but there are exceptions and they wouldn't be flaggable this way.

What about table level constraint?
I know I couldn't nest a select statement (and count) inside the constraint expression. Are there any options along those lines?

Forgive me if I am discovering the wheel out loud.

|||

However, I now face having to do 2 extra tables whenever I want to implement flagging logic, or have a central depository of many dissimilar flags and always having to join to this depository to be able to use the flags. In the former a table bloat, in the latter clunky code.

the code is not clunky at all. . . make views from simple select statements.

Always bear in mind that a primary key is a 'tuple' that defines a unique entity within a domain. Whereever possible, it should be a piece of data that you can look at without knowledge of the db and know what it is (natural key) . And as you noted, often times the primary key is compound and creating a foreign key reference would be unwieldly. When this is the case, the approach is to use what is called a unique, non-null "surrogate key" and propagate that value in foreign key references. You might find it useful to use uniqueidentifier fields. UniqueID's are cheap to create and generatate.

Tables are cheap to create too. Joins are cheap too.

Databases are geared to work with them - as long as the indexes are properly defined.

I am not sure if a table constraint is possible. Even if it were, it would take a total table scan to assure that the constraint we enforced.

google "Surrogate Key"

|||

I looked into surrogate keys and will definitely rethink doing composite keys in the future.

Learned a lot from you!

Thanks a bunch.

|||

cheers!

|||

Blair,

Can I milk you for a bit more advice?

Is this type of flagging appropriate to implement data versioning?

say a sports league. A player can be active in just one team, but I want to keep prior team membership in a foreign key table. Only one of those team memberships would be flagged 'current' as we discussed. The rest are just stale versions, but I want to keep that history data.

When the player moves to another team, it would have to be a transaction removing the flag from the old team, and flagging the new one?

Is this the right way of doing versioning?

Carl

|||

I don't know if versioning is the right term. I would consider this a Transaction. Note: i am using this in the natural sense of the word, not database concurrency management.

Legend: Table(KeyFields, SurrogateKeyField, SupplementalField), with foreign keys marked by Field -> Table(KeyField) and I am using integer ID's for brevity. ID data type is irrelevant with the condition that it is easily generated.

First, Active Status on a team would be enforced by Roster(PlayerID, TeamID)

Next, I would have a table TransactionType(TransID, TransName, TransAbbr)

this would be populated by something like:

1, Draft, DR
2, Free-Agent, FA
3, Injured, IR
4, Disabled, DL
5, Trade Out, TO
6, Trade In, TI
7, Released, RE

I would create a PlayerHistory table to track the status changes of players -

PlayerHistory(HistoryID, PlayerID -> Player(PlayerID), TeamID -> Team(TeamID), TransID -> TransactionType(TransID), DateOfEntry , RelatedHistory -> PlayerHistory(HistoryID))

Note the self reference on player history, this would be used for trades - Each trade out has a related trade in. For all other entries in a players history, the related history would be null.

Say team1 traded player 1 to team 2 for player 2 on 1/1/2006 -

1, 1, 1, 5, 1/1/2006, 2
2, 1, 2, 6, 1/1/2006, 1
3, 2, 2, 5, 1/1/2006, 4
4, 2.,1, 6, 1/1/2006, 3

This could get more elaborate - TradeMaster (TradeID, AgreementDocument)

TradeHistory(TradeID, PlayerHistoryID -> PlayerHistory(HistoryID))

This could be used to track the Trade-Outs that were affected by the aggreement. From our example above:

TradeMaster might be:
1, "Team 1 will give Team 2 a first round draft pick next year"

And TradeHistory would simply be:
1, 1
1, 3

You could trigger off of TradeHistory to automatically insert PlayerHistory and update Roster

Now you might be thinking, as you said "Table Bloat." Tables and Disk-Space is cheap. What is expensive is processor time! You want to minimize the amount of clock cycles that need to be executed, both in the client application or the database server. Inserting or changing data that is indexed or constrained is expensive. Get it in the table in a way that minimizes changes to indexed fields. If you have to manipulate data that is indexed or constrained, you want to do it once, if possible.

Finally, it only takes the execution of 3 instructions to retrieve a set of rows based on index. Indexes on bit fields give no benefit.

Other things to google "Normal-Form" and "DKNF" - Note: DKNF is an ideal, strive for it, but it is not always feasible.

Database Theory is as much an art as it is a science. People get PhD's in Database Therory, but there is no "Right" model for a given Domain Set, but there are models that are more proper than others. I liken it to the game othello, "A minute to learn, a lifetime to master."

Oh yeah! Don't forget to eat and sleep! (Yeah, as if you will be able to sleep once you get started thinking about your data model)

Don't get me started on the term "Normal" and it roots in mathematics.

Good luck and have fun.

|||

Thanks Blair,

You answered even my next couple of questions.

Good advice re the sleep bit!

Cheers

No comments:

Post a Comment