I want to update a series of columns Country1, Country2 ... Country9 based on a comma delimited string of country names in column Country.
Country1, Country2, Country3... is an anti-pattern, and you're feeling the consequences of poor table design. Don't feel too put out; it's a very common one. Lists are difficult and non-intuitive to work with in standard SQL. The process of sorting out this mess is referred to as "normalization".
Lists are represented by join tables in a one-to-many relationship. One t has many countries. This is a very big topic to get into, but here's a sketch.
Instead of having Country1, Country2, Country3, etc. in whatever t is (I'm going to call it Thing), you'd have a table called ThingCountries to represent the list.
create table ThingCountries (
ThingID integer references Thing(id),
Country text
);
Each Country belonging to a Thing would be inserted into ThingCountries along with the ID of the thing.
-- Do this in a loop
insert into ThingCountries (ThingID, Country) values (?, ?)
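As a minimal sketch of that loop: the question doesn't say what host language is in use, so this assumes Python with its built-in sqlite3 module, and split_countries is a hypothetical helper. It splits each comma-delimited string and inserts one row per country.

```python
import sqlite3

def split_countries(raw):
    """Hypothetical helper: split a comma-delimited string into trimmed, non-empty names."""
    if raw is None:
        return []
    return [name.strip() for name in raw.split(",") if name.strip()]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Thing (id integer primary key, Country text);
    create table ThingCountries (
        ThingID integer references Thing(id),
        Country text
    );
    insert into Thing (id, Country) values (1, 'France, Spain,Italy');
    insert into Thing (id, Country) values (2, null);
""")

# Read the old comma-delimited column first, then insert one row per country.
rows = conn.execute("select id, Country from Thing").fetchall()
for thing_id, raw in rows:
    conn.executemany(
        "insert into ThingCountries (ThingID, Country) values (?, ?)",
        [(thing_id, name) for name in split_countries(raw)],
    )
conn.commit()

migrated = conn.execute(
    "select ThingID, Country from ThingCountries order by rowid"
).fetchall()
```

A Thing whose Country is null simply ends up with no rows in ThingCountries, so there are no empty columns to fill in.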
They'd be retrieved with a join by linking Thing.ID and ThingCountries.ThingID.
select ThingCountries.Country from Thing
join ThingCountries on Thing.ID = ThingCountries.ThingID
where Thing.ID = ?
By querying ThingCountries you can quickly find out which Things have a certain Country.
select ThingID from ThingCountries
where Country = ?
They can be removed with a simple delete.
delete from ThingCountries where ThingID = ? and Country = ?
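To see the join, the lookup, and the delete working together, here is a small sketch. It assumes Python's built-in sqlite3 module and made-up sample data; only the three queries come from the text above, with an order by added so the results are predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Thing (id integer primary key);
    create table ThingCountries (
        ThingID integer references Thing(id),
        Country text
    );
    insert into Thing (id) values (1), (2);
    insert into ThingCountries (ThingID, Country) values
        (1, 'France'), (1, 'Spain'), (2, 'France');
""")

# Countries belonging to one Thing, via the join.
countries_of_1 = [row[0] for row in conn.execute(
    """select ThingCountries.Country from Thing
       join ThingCountries on Thing.ID = ThingCountries.ThingID
       where Thing.ID = ?
       order by ThingCountries.Country""", (1,))]

# Things that have a given Country.
things_in_france = [row[0] for row in conn.execute(
    "select ThingID from ThingCountries where Country = ? order by ThingID",
    ("France",))]

# Remove one Country from one Thing, then check what is left.
conn.execute(
    "delete from ThingCountries where ThingID = ? and Country = ?",
    (1, "Spain"))
remaining_of_1 = [row[0] for row in conn.execute(
    "select Country from ThingCountries where ThingID = ?", (1,))]
```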
There's no need to know how many Country columns there are. There are no gaps to be filled in. There's no limit to how many Countries a Thing can have.
Later on down the road you might want to store information about each country, like its name and abbreviation, in which case you make a Country table.
create table Country (
id integer primary key,
name text not null,
abbrev text not null
);
And then ThingCountries references Country by its id rather than storing the Country name.
create table ThingCountries (
ThingID integer references Thing(id),
CountryID integer references Country(id)
);
Now you can store whatever information you want about each Country, and it protects against typos (because the Country has to exist in the Country table).
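Here's a rough sketch of that protection in action, assuming Python's built-in sqlite3 module and invented sample rows. One SQLite-specific detail: it only enforces foreign keys after pragma foreign_keys = on has been run on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign keys are only enforced once this pragma is on.
conn.execute("pragma foreign_keys = on")
conn.executescript("""
    create table Thing (id integer primary key);
    create table Country (
        id integer primary key,
        name text not null,
        abbrev text not null
    );
    create table ThingCountries (
        ThingID integer references Thing(id),
        CountryID integer references Country(id)
    );
    insert into Thing (id) values (1);
    insert into Country (id, name, abbrev) values (1, 'France', 'FR');
    insert into ThingCountries (ThingID, CountryID) values (1, 1);
""")

# A CountryID with no matching row in Country is rejected.
try:
    conn.execute(
        "insert into ThingCountries (ThingID, CountryID) values (?, ?)",
        (1, 999))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Country details are now one join away.
names = [row[0] for row in conn.execute(
    """select Country.name from ThingCountries
       join Country on Country.id = ThingCountries.CountryID
       where ThingCountries.ThingID = ?""", (1,))]
```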
Do the same thing for Genre and your problem goes away.
It's a bit awkward, but that's how SQL does it. Best to get used to it. Alternatively some databases offer array types to make this simpler, like Postgres arrays.
You should run 2 separate update queries. The problem is that Country can be null independently of Genre. So however you combine the conditions Country IS NOT NULL and Genre IS NOT NULL in the where clause of the query, you will miss some columns that could have been updated, or you will update some columns with null values.
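A sketch of the two-query approach, with heavy assumptions: the table name t, the sample rows, and especially the function are guesses. I haven't seen returnCommaDelimitedValue, so nth_item below is a hypothetical stand-in registered under that name, assumed to take (string, position) and return the item at that 1-based position. Python's built-in sqlite3 module is used only to make the example runnable.

```python
import sqlite3

def nth_item(raw, n):
    """Hypothetical stand-in: n-th (1-based) item of a comma-delimited
    string, or None if the string is None or has fewer than n items."""
    if raw is None:
        return None
    parts = [p.strip() for p in raw.split(",")]
    return parts[n - 1] if 1 <= n <= len(parts) else None

conn = sqlite3.connect(":memory:")
# Assumed signature: returnCommaDelimitedValue(string, position).
conn.create_function("returnCommaDelimitedValue", 2, nth_item)
conn.executescript("""
    create table t (
        id integer primary key,
        Country text, Genre text,
        Country1 text, Country2 text,
        Genre1 text, Genre2 text
    );
    insert into t (id, Country, Genre, Country1)
        values (1, 'France, Spain', null, null),
               (2, null, 'Jazz, Blues', 'Legacy');
""")

# One update per source column, each guarded by its own condition.
conn.execute("""
    update t set
        Country1 = returnCommaDelimitedValue(Country, 1),
        Country2 = returnCommaDelimitedValue(Country, 2)
    where Country is not null""")
conn.execute("""
    update t set
        Genre1 = returnCommaDelimitedValue(Genre, 1),
        Genre2 = returnCommaDelimitedValue(Genre, 2)
    where Genre is not null""")

result = conn.execute(
    "select id, Country1, Country2, Genre1, Genre2 from t order by id"
).fetchall()
```

Row 2 has a null Country, so its existing Country1 value is left alone while its Genre columns are still filled in.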
Now, I haven't seen the implementation of returnCommaDelimitedValue. It is possible that it returns null when the string argument is null. In that case, you might consider removing the where clause completely and updating the countryN and genreN columns in the same query. If Country is null, this will also make Country1 null, which might be what you want. If almost all rows have a non-null Country and Genre, this approach might be faster.
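For comparison, a sketch of the single-query variant under the same guesses: nth_item is a hypothetical stand-in for returnCommaDelimitedValue with an assumed (string, position) signature that returns null for a null string, and the table t and its rows are made up. It assumes Python's built-in sqlite3 module.

```python
import sqlite3

def nth_item(raw, n):
    """Hypothetical stand-in: returns None for a None string, otherwise
    the n-th (1-based) comma-delimited item, or None if out of range."""
    if raw is None:
        return None
    parts = [p.strip() for p in raw.split(",")]
    return parts[n - 1] if 1 <= n <= len(parts) else None

conn = sqlite3.connect(":memory:")
conn.create_function("returnCommaDelimitedValue", 2, nth_item)
conn.executescript("""
    create table t (
        id integer primary key,
        Country text, Genre text,
        Country1 text, Genre1 text
    );
    insert into t (id, Country, Genre, Country1)
        values (1, 'France, Spain', null, null),
               (2, null, 'Jazz, Blues', 'Legacy');
""")

# No where clause: every row is touched, null sources produce null targets.
conn.execute("""
    update t set
        Country1 = returnCommaDelimitedValue(Country, 1),
        Genre1 = returnCommaDelimitedValue(Genre, 1)""")

result = conn.execute(
    "select id, Country1, Genre1 from t order by id"
).fetchall()
```

Note that row 2 loses its previous Country1 value here, which is exactly the side effect described above.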